A bit of fun with Cognitive Services

On a chilly spring morning I thought I would put up a quick post on something a bit fun. If you haven’t checked out the Microsoft Cognitive Services apps, take the time to have a play; they’re both fun and innovative.

I wanted to have a bit of a play myself, so I jumped onto my Azure portal and added a Cognitive Services account. I thought I’d start with something my brother and I had endless amounts of fun with back in the 1990s: Text to Speech.

The Bing Speech API (preview) was the weapon of choice. So I headed to the TTS GitHub repo and, wouldn’t you know it, no PowerShell examples. I can understand though; PowerShell isn’t always something that developers would go to. So I thought I’d put my recent study of Python to good use and translate the Python script into PowerShell. It worked quite well: in about 15 minutes I was playing with different accents.

Give this one a shot.


$SayWhat = Read-Host -Prompt "What would you like me to say? `n"
$Language = 'en-au'
$Gender = 'Female'
$voice = '(en-AU, Catherine)'

#enter your API key here
$apiKey = ""

#build up headers to request access token
$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Ocp-Apim-Subscription-Key", $apiKey)

$AccessTokenUri = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken"

# Connect to server to get the Access Token

$accesstoken = Invoke-RestMethod -Uri $AccessTokenUri -Method POST -Headers $headers

$body = "<speak version='1.0' xml:lang='$language'><voice xml:lang='$language' xml:gender='$gender' name='Microsoft Server Speech Text to Speech Voice $voice'>$SayWhat</voice></speak>"

#Build headers for TTS API

$headers = New-Object "System.Collections.Generic.Dictionary[[String],[String]]"
$headers.Add("Content-type", "application/ssml+xml")
$headers.Add("X-Microsoft-OutputFormat", "riff-16khz-16bit-mono-pcm")
$headers.Add("Authorization", "Bearer $accesstoken")
$headers.Add("X-Search-AppId", "07D3234E49CE426DAA29772419F436CA")
$headers.Add("X-Search-ClientID", "1ECFAE91408841A480F00935DC390960")
$headers.Add("User-Agent", "TTSForPowerShell")

#Connect to server to synthesize the wave and write to Wav file
#(-OutFile saves the response body to disk, so there is nothing to capture in a variable)
Invoke-RestMethod -Uri 'https://speech.platform.bing.com/synthesize' -Method POST -Headers $headers -Body $body -OutFile "$env:temp\TTS_API_Test.wav"
#Play the sound
(New-Object Media.SoundPlayer "$env:temp\TTS_API_Test.wav").PlaySync()
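
The script above drops whatever you type straight into the SSML, so text containing `&` or `<` would make the XML invalid. Here is a minimal sketch of one way around that. It assumes a helper I've called New-SsmlBody (my own name, not part of the API) that escapes the text first and takes the locale, gender and voice as parameters, with the same defaults as the script:

```powershell
# Hypothetical helper: builds the same SSML body as the script above,
# but escapes the spoken text first.
function New-SsmlBody {
    param(
        [Parameter(Mandatory)][string]$Text,
        [string]$Language = 'en-au',
        [string]$Gender   = 'Female',
        [string]$Voice    = '(en-AU, Catherine)'
    )

    # Escape &, <, >, " and ' so the typed text cannot break the XML
    $safeText = [System.Security.SecurityElement]::Escape($Text)

    "<speak version='1.0' xml:lang='$Language'>" +
    "<voice xml:lang='$Language' xml:gender='$Gender' " +
    "name='Microsoft Server Speech Text to Speech Voice $Voice'>" +
    "$safeText</voice></speak>"
}

# Example: the ampersand is sent as &amp; rather than raw
$body = New-SsmlBody -Text 'Fish & chips'
```

With something like this in place, `$body = New-SsmlBody -Text $SayWhat` could stand in for the `$body = ...` line, and trying another voice should just be a matter of passing different -Language, -Gender and -Voice values.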

You can find all the voices and locales here.

I chose the en-au voice because I wanted to see how bad the Aussie accent is, and yes, it’s not great, and they don’t have a male voice *raises hand*.

The only thing about these is that the voices sound too nice. Where is the geeky, nasal-laden, mid-toned voice of the IT guy *raises hand again*?

Give it a go, and have some fun, and mind the cuss words.
