
Online Speech Tools 

Speech to Text and Text to Speech


Speech to Text


This tool allows you to dictate your plans, goals, descriptions or reminders.

Enjoy the flexibility to record information at any location with your mobile phone. 

Save and send.


Reduce paperwork and share files across devices (printing option is available).

Add punctuation when recording and use spell check with Grammarly.

Choose from different language options.


  1. Start dictation.

  2. Speak clearly and add punctuation (using words).

  3. Clear content if you want to start again. 

  4. Save as a .txt or a .doc.

  5. Send to your email.


Text to Speech


Conveniently plays text with a voice generator in English, Spanish, German, Italian, French or Russian.

Different voice options are available.

Listen to the converted recording to practice intonations. 

Choose a volume, speed and pitch preference.


Save as a .wav file.


  1. Type or copy and paste text into the box.

  2. Choose language. 

  3. Play. 

  4. Change the volume, speed and pitch.

  5. Clear content if you want to start again. 


Voice Clarity

Establish individuality with your voice.

Record and replay speech signals.

Practice pitch and analyse with the fundamental frequency.

Save speech patterns to practice clear speech. 

Check vowel quality (sound frequency of formants).



  1. Record dialogue segments of your speech and stop.

  2. Click the screen to use the green line as a guide to the starting point in playing the recording. 

  3. Inspect spectrogram for the first dark band (first formant; set of adjacent harmonics at the base of the graph from a boost in resonance).

  4. Examine the mean fundamental frequency.

  5. Peak-pick the fundamental frequency (find its highest point) to determine the pitch.

  6. Practice delivering a voice with warmth and clarity.

  7. Look for phonetic stress marks to practice higher and lower pitches of vowel sounds.

  8. Analyse rhythm for a consistent harmonising pattern in English speech.
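
Steps 4 and 5 above can be sketched in code. This is only an illustration and not how the recording tool itself works. It assumes each short frame of a recording is available as a list of samples, uses pure sine tones as a stand-in for a voice, and estimates the fundamental frequency of each frame from the strongest autocorrelation peak. The function names, the 8000 Hz sample rate and the 75 to 400 Hz search range are assumptions made for this example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this example)

def synth_tone(freq_hz, duration_s=0.1):
    """Return a pure sine tone as a list of samples (a stand-in for one voiced frame)."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

def estimate_f0(frame, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame, in Hz.

    Tries every lag (candidate period, in samples) between 1/fmax and 1/fmin
    and keeps the lag at which the frame best matches a shifted copy of itself.
    """
    min_lag = int(SAMPLE_RATE / fmax)
    max_lag = int(SAMPLE_RATE / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = len(frame) - lag
        # Average product of the frame with its shifted copy (normalised autocorrelation).
        score = sum(frame[i] * frame[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return SAMPLE_RATE / best_lag

# Three hypothetical frames of "speech" at 110, 120 and 150 Hz.
frames = [synth_tone(f) for f in (110, 120, 150)]
contour = [estimate_f0(frame) for frame in frames]

mean_f0 = sum(contour) / len(contour)  # step 4: mean fundamental frequency
peak_f0 = max(contour)                 # step 5: peak pick (highest point)
print(f"mean f0: {mean_f0:.0f} Hz, peak f0: {peak_f0:.0f} Hz")
```

Because the lag is a whole number of samples, each estimate is accurate only to within a hertz or two at this sample rate. Real speech is also noisier than a sine tone, so a dedicated pitch tracker would normally be used.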


Frequency and Fundamental frequency 


Vibration and resonance averages ("Voice depends on vocal fold vibration and resonance", n.d.):

  • Men: 110 to 155 Hz (lower pitch)

  • Women: 180 to 220 Hz (medium pitch)

For warm and truthful voices:

  • Men: 100 to 200 Hz

  • Women: up to 300 Hz


The fundamental frequency is approximately the lowest resonance frequency (Siupsinskiene & Lycke, 2011).

  • Men: 85 to 180 Hz  

  • Women:  165 to 255 Hz
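
A measured mean fundamental frequency can be compared with the ranges listed above. The sketch below is illustrative only: it hard-codes the two ranges attributed to Siupsinskiene & Lycke (2011), and the names `SPEAKING_F0_RANGES_HZ` and `matching_ranges` are made up for this example. The ranges overlap between 165 and 180 Hz, so a value can match both.

```python
# Fundamental frequency ranges in Hz, copied from the list above.
SPEAKING_F0_RANGES_HZ = {
    "men": (85.0, 180.0),
    "women": (165.0, 255.0),
}

def matching_ranges(f0_hz):
    """Return the labels of every listed range that contains f0_hz."""
    return [
        label
        for label, (low, high) in SPEAKING_F0_RANGES_HZ.items()
        if low <= f0_hz <= high
    ]

for value in (120, 170, 300):
    print(value, "Hz falls in:", matching_ranges(value) or "no listed range")
```

A value outside both ranges is not necessarily a problem, since the measurements are approximate.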





Higher pitch 


Use a rise to a higher pitch with open questions, closed questions or closed sentences (Collins & Mees, 2003).

A joyful pitch with indignation at the beginning and the end of a sentence facilitates audience comprehension (Mozziconacci & Hermes, 1999).


These higher pitches improve clarity in the English language.


Higher pitches span a range of fundamental frequency (f₀) contours:



Indignation    (f₀)  150 to 315 Hz

Joy            (f₀)  150 to 250 Hz


Emotions such as joy, happiness and confidence influence pitch. Use them when clarifying information or emphasising a concept to make the experience more memorable (Dunn & Schweitzer, 2005).


Lower pitch 


Use a falling intonation pattern with statements or commands (Collins & Mees, 2003).


Vowel sounds

A long repeated segment ("the period", giving a low repetition frequency) has a low pitch; a short repeated segment (high repetition frequency) has a high pitch (Schnupp, Nelken & King, 2011).
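
The link between the length of the repeated segment and the pitch is the standard relation frequency = 1 / period. A small worked example, with a hypothetical helper name and made-up period values:

```python
def repetition_frequency_hz(period_ms):
    """Convert the length of one repeated segment (in milliseconds) to hertz."""
    if period_ms <= 0:
        raise ValueError("period must be positive")
    return 1000.0 / period_ms

# A long 10 ms period repeats 100 times per second (lower pitch);
# a short 4 ms period repeats 250 times per second (higher pitch).
print(repetition_frequency_hz(10))
print(repetition_frequency_hz(4))
```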

The higher the vowel, the lower the first formant.

Vowel height        F1 (Hz)      Example words
High [i]~[u]        280 ~ 310    (fee) - (boot)
Mid-high [ɪ]~[ʊ]    400 ~ 450    (fit) - (foot)
Mid-low [ɛ]~[ɔ]     550 ~ 590    (fate) - (saw)
Low [æ]~[a]         690 ~ 710    (cat) - (father)
Listen to the vowel sounds


Approximate formant frequencies in American English vowels (Carr, Durand & Ewen, 2005; University of Arizona, 2004)
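
When a first formant value has been read from the spectrogram, the approximate F1 ranges above can be used to find the closest vowel height. The sketch below is an illustration only: it picks the band whose midpoint is nearest to the measured value, and the names `F1_BANDS_HZ` and `closest_vowel_height` are invented for this example.

```python
# Approximate first formant (F1) ranges in Hz, copied from the table above.
F1_BANDS_HZ = [
    ("high", 280, 310),
    ("mid-high", 400, 450),
    ("mid-low", 550, 590),
    ("low", 690, 710),
]

def closest_vowel_height(f1_hz):
    """Return the vowel height whose F1 band midpoint is nearest to f1_hz."""
    def distance(band):
        _, low, high = band
        return abs(f1_hz - (low + high) / 2)
    return min(F1_BANDS_HZ, key=distance)[0]

for measured in (300, 420, 700):
    print(measured, "Hz is closest to a", closest_vowel_height(measured), "vowel")
```

Formant values vary between speakers, so the result is a rough guide rather than a classification.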



Rhythm and Pausing


Plan to use longer stretches of speech for better fluency (Paradis, 2009). Deliver a voice with more warmth by pausing.

Reach for clearer pitches by focusing on truthful thoughts (Ekman, O'Sullivan, Friesen & Scherer, 1991). Present a speech more clearly by pausing to give the audience time for cognitive acceptance.

For international presentations, deliver the speech at approximately 100 words per minute (Gerver, 1969; Seleskovitch, 1978; Lederer, 1981; as cited in Chang, 2005, p. 6).

Establish a rhythm for listeners to process auditory input efficiently (seen on the spectrogram as regularities in duration, peaks, phonological vowel sounds, and syllable quantity and quality) (Steiner, 2004).




WASP - Waveform, Spectrogram and Pitch Display; Mark Huckvale, University College London


Voice loudness 


The mean most comfortable listening level (MCL) is 42.7 dB HL (decibels Hearing Level) (Franklin, Thelin, Nabelek & Burchfield, 2006).


Louder voices can enhance activation states of formality, indignation, interest, stress and happiness.

Softer voices enhance activation states of apology, boredom, intimacy, relaxation and sadness (Gussenhoven, 2002).



Words per Minute

Speech in minutes: Speech calculator: WHOISGUARD, INC.

Enter the number of words and convert to words per minute.

Choose the average reading speed (100, 130 or 160 words per minute).

Press ENTER for the approximate calculated time to deliver your speech.
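
The calculation behind such a tool is simply the word count divided by the speaking rate. The sketch below shows the arithmetic only and is not the calculator's own code. The function name `speech_duration` and the default of 130 words per minute are assumptions made for this example.

```python
def speech_duration(word_count, words_per_minute=130):
    """Return the approximate delivery time as (minutes, seconds)."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    total_seconds = round(word_count / words_per_minute * 60)
    return divmod(total_seconds, 60)

# A 650-word speech at each of the three reading speeds.
for rate in (100, 130, 160):
    minutes, seconds = speech_duration(650, rate)
    print(f"{rate} wpm: {minutes} min {seconds} s")
```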


Online Clock

Time yourself practising your presentation. 

Change the display size (small, medium, large or extra large).

Red numerals with choice of background colour (blue, black, silver, green or orange).

Choose between options of time, stopwatch, countdown or analogue clock.

Set alerts for timing tasks in the presentation.

Set your phone to never auto lock during the practice.

Use the counter as a handheld tally counter.




Carr, P., Durand, J., & Ewen, C. J. (2005). Headhood, elements, specification and contrastivity. Philadelphia: John Benjamins Publishing Company.


Chang, C. (2005). Directionality in Chinese/English simultaneous interpreting: Impact on performance and strategy use. Unpublished doctoral dissertation, The University of Texas at Austin - Austin.


Collins, B., & Mees, I. M. (2003). Practical phonetics and phonology: A resource book for students (2nd ed.). New York: Routledge.


Dunn, J. R., & Schweitzer, M. E. (2005). Feeling and believing: the influence of emotion on trust. Journal of Personality and Social Psychology, 88(5), 736–748. doi: 10.1037/0022-3514.88.5.736


Ekman, P., O'Sullivan, M., Friesen, W. V., & Scherer, K. R. (1991). Face, voice and body in detecting deception. Journal of Nonverbal Behavior, 15(2). Retrieved from


Franklin, C. A. Jr., Thelin, J. W., Nabelek, A. K., & Burchfield, S. B. (2006). The effect of speech presentation level on acceptance of background noise in listeners with normal hearing. Journal of the American Academy of Audiology, 17, 141–146. doi: 10.3766/jaaa.17.2.6


Gussenhoven, C. (2002). Intonation and biology. In H. Jacobs & W. L. Wetzels (Eds.), Liber Amicorum Bernard Bichakjian (pp. 59–82).


Mozziconacci, S. J., & Hermes, D. J. (1999). Role of intonation patterns in conveying emotion in speech. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville & A. C. Bailey (Eds.), Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, CA, USA, August 1-7, 1999 (pp. 2001–2004). Berkeley: Linguistics Department, University of California.


Paradis, M. (2009). Declarative and procedural determinants of second languages. Amsterdam: John Benjamins.


Schnupp, J., Nelken, I., & King, A. (2011). Auditory neuroscience: Making sense of sound. Cambridge, MA: Massachusetts Institute of Technology Press.


Siupsinskiene, N., & Lycke, H. (2011). Effects of vocal training on singing and speaking voice characteristics in vocally healthy adults and children based on choral and nonchoral data. Journal of Voice, 25, e177–189. doi:10.1016/j.jvoice.2010.03.010


Steiner, I. (2004). 5th European Masters in Language and Speech Summer School Institute for Communications Research & Phonetics, University of Bonn.: Tutorial 5, Analyzing speech rhythm. Retrieved from


The University of Arizona. (2004). Retrieved October 19, 2017, from The University of Arizona, Tucson, Web site:


Voice depends on vocal fold vibration and resonance, (n.d.) Retrieved October 19, 2017, from The Voice Foundation, Philadelphia, The Voice Foundation Web site:



Measurements are only an approximation.
