I didn't expect to become such good friends with Alexa

I tried it once by installing the appropriate app on my phone. However, it bothered me that the Home button then no longer called up the Google app, which I needed far more often. Then Amazon lent me an Echo, and my life changed a bit for the better.

I am falling more and more in love with Alexa. I will not hide that it bothers me that Amazon's assistant does not understand Polish. I consider my English fluent, but I think in Polish, so it is more natural for me to express myself in that language. Once I get past that, however, Alexa turns out to be an incredibly convenient service. And I do not even have a smart home installation.

I communicate with Alexa through the Echo speaker. That changes a lot, because I no longer have to reach for my phone to speak to it. Nor do I have to remove the Google app as the default for the Home button. I just say "Alexa" and then the command.

Two key features have made Alexa my inseparable friend: modularity and flexibility.

We all know the theory behind natural speech understanding. Knowing it is one thing, however, and using it in practice is another. I have never seen Alexa fail to understand what I asked of her. Sometimes I say something indistinctly, because I do not even try to put on an accent in my English, but repeating the phrase does the trick. Besides, such situations are extremely rare. It does not matter whether I ask what the weather will be tonight or whether I need to dress warmly to go outside: Amazon's assistant understands my intent perfectly. I always get the right answer.

The second of these advantages is flexibility, meaning the so-called skills, or extensions of the assistant. Alexa does not tie me to a single "right" set of service providers. For example, I do not use the Amazon Music service. So if I say "Alexa, play me psychedelic rock", she picks a suitable playlist from my paired Spotify account. Pairing it with my Outlook or iCloud is no problem either. Even Polish companies build useful skills: "Alexa, read me news from Onet", and I know what is happening in the world and in the country without taking my hands off the breakfast I am preparing.

How does Alexa work? I had the opportunity to talk to the head of the Text-to-Speech department at Amazon. You may recognize him from the Polish IT market.

Before Rafał Kukliński got involved in speech synthesis for Alexa and other Amazon products, he developed Ivona, the Polish speech synthesizer that won awards on international markets. As it turns out, speech understanding is not the only big challenge for developers: so is speech synthesis itself.

Maciej Gajewski, Spider's Web: Rafał, for starters, could you explain what exactly you do at Amazon?

Rafał Kukliński, Amazon: I have two roles. The first is director of Amazon's Text-to-Speech department, which means I head the team that works on speech synthesis technology at Amazon. All our "talking" products - Kindle, Fire, accessibility features, Alexa - are developed by the team in Gdańsk. The second is managing the Amazon Technology Development Center. My task is to make sure that we work in a fantastic atmosphere and keep attracting new, interesting Amazon projects to Gdańsk.

What exactly lies ahead of us in the development of speech synthesis?

There are many challenges, in virtually every element of speech synthesis. Today, many people are delighted with the "naturalness" of the generated voice and wonder what else could be improved. However, if we go back, say, three years and listen to the assistants of that period, whose speech we already considered natural at the time, we will notice enormous progress.

Today at Amazon we are working on many innovations. The obvious one is "teaching" Alexa new languages. In each of them, a major challenge is proper intonation that is consistent with the meaning of the text and conveys the content correctly. To achieve this, we must be able to "understand" the meaning of the text, so semantics matters to us as well. Another challenge that remains relevant is text normalization, that is, interpreting individual symbols depending on the context.
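To make "text normalization" concrete, here is a minimal Python sketch of rewriting symbols into speakable words depending on their context. It is an illustration only, not Amazon's method: the rule set, the function names `number_to_words` and `normalize`, and the "next word is capitalized" heuristic for "St." are all assumptions made for this example. The interview itself says production systems use machine learning models for this.

```python
import re

# Hypothetical, hand-written rules for illustration only.
UNITS = {"km": "kilometers", "kg": "kilograms"}
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def _dollars(match: re.Match) -> str:
    # "$5" is written symbol-first but read number-first: "five dollars".
    amount = int(match.group(1))
    unit = "dollar" if amount == 1 else "dollars"
    return f"{number_to_words(amount)} {unit}"


def _measure(match: re.Match) -> str:
    # "12 km" -> "twelve kilometers"
    return f"{number_to_words(int(match.group(1)))} {UNITS[match.group(2)]}"


def normalize(text: str) -> str:
    """Expand a few symbols and abbreviations into speakable words."""
    text = re.sub(r"\$(\d{1,2})\b", _dollars, text)
    text = re.sub(r"\b(\d{1,2})\s?(km|kg)\b", _measure, text)
    # "St." depends on context: before a capitalized name it is read as
    # "Saint", otherwise as "Street". This heuristic is deliberately crude.
    text = re.sub(r"\bSt\.(?=\s+[A-Z])", "Saint", text)
    text = re.sub(r"\bSt\.", "Street", text)
    return text
```

For example, `normalize("Meet at St. Mary on Main St. in 12 km")` expands the same abbreviation in two different ways. A rule like this breaks easily (think of "Main St. Next door"), which is one reason hand-written rules do not scale across languages.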


The languages themselves also differ significantly. In Japanese, for example, the concept of homography takes on special significance. Homographs are words that, with the same spelling, can have a different meaning and pronunciation depending on the context. What's more, in Japanese almost every character is a homograph with many meanings and pronunciations. Recognizing them properly is therefore critical to the quality of a Japanese speech synthesizer. There is still a lot of room for development here.

I know that a voice actress is used in building the speech synthesis for Cortana. Is it the same with Alexa?

Yes, we work with talented voice actors. We try to make the synthesized speech as close as possible to the samples provided by the actress. We record her first, and then build a model of intonation and speech generation that can produce speech as close as possible to that of the actress or actor. This is our measure of naturalness. We check it in a so-called blind test: we play the voice of the actress and the voice of the synthesizer to people who do not know which is which, and ask them to rate naturalness. The synthesizer's score should be as close as possible to the score of the actress or actor.
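The blind test described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clip identifiers, the "actor" and "synth" labels, the function names, and the rating scale are all hypothetical, and the interview does not say how Amazon actually collects or aggregates ratings.

```python
import random
from statistics import mean


def blind_playlist(clips, seed=None):
    """Shuffle clips and hide their source from listeners.

    clips: list of (clip_id, source) pairs, source being "actor" or "synth".
    Returns the play order (ids only) and a secret key mapping id -> source.
    """
    rng = random.Random(seed)
    order = list(clips)
    rng.shuffle(order)
    playlist = [clip_id for clip_id, _source in order]
    key = {clip_id: source for clip_id, source in order}
    return playlist, key


def naturalness_gap(scores, key):
    """Average the ratings per source and report how far the synthesizer is
    from the human recording.

    scores: dict mapping clip_id -> list of listener ratings.
    """
    by_source = {"actor": [], "synth": []}
    for clip_id, ratings in scores.items():
        by_source[key[clip_id]].extend(ratings)
    actor_avg = mean(by_source["actor"])
    synth_avg = mean(by_source["synth"])
    return actor_avg, synth_avg, actor_avg - synth_avg
```

Listeners only ever see the playlist of ids. The key is used afterwards to split the ratings, and the smaller the gap, the closer the synthesizer is to the recorded voice.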

But doesn't perceived naturalness come from imperfection? We grunt, we break up sentences with "um" and "uh". Alexa always speaks beautifully and correctly...

Many elements influence how natural speech is perceived to be. At Amazon, we employ sound engineers, linguists, and user experience (UX) researchers who analyze what makes speech come across as natural: is it perhaps inserted pauses, or a break for breath? We are constantly testing this.

Do you work independently, or do you get guidelines from Seattle on what you should do and create?

Amazon gives us a lot of independence, which suits me personally. How we achieve our goals is entirely up to us. It is here in Gdańsk that we decide which people we want to employ and in which positions. Amazon expects results. This form of cooperation, based on trust, clearly works.

So when will you hire people who work on Polish speech?

As I mentioned, I am responsible for the text-to-speech department. And Amazon already has Polish speech synthesis in many products, such as Fire tablets or Amazon Web Services. We are working on adding more languages to Alexa, because that is probably what you are asking about. We want it to be available on every Amazon device.

I am keeping my fingers crossed that we can finally start enjoying the results of this work. And tell me, does artificial intelligence support the development of speech synthesis? It certainly supports speech understanding.

Yes. It is used at practically every stage of developing speech synthesis. Artificial intelligence, in other words machine learning, lets us create models that choose the right value based on thousands of factors. We build machine learning models for correctly interpreting and expanding acronyms, distinguishing homographs, modeling intonation, generating sound, and so on. I believe that without AI we would not be able to cope with, for example, manually teaching our mechanisms the Japanese language.

And what about Alexa Skills? As Fabrice Rousseau assured me, there will only be more of them.

Amazon is focused on making the development of Alexa skills as simple as possible. Today, as Fabrice confirms, writing a skill for our voice assistant is even easier than creating a simple web browser extension. With the Alexa Skills Kit, anyone can quickly and easily draw on Amazon's knowledge of voice solution design.

Maciej Gajewski, Spider's Web: Let's say I'm a young inventor. I would like my new invention to be able to use Alexa. What exactly do I have to do?

Fabrice Rousseau, Alexa Skill General Manager: Developers do not need a background in natural language understanding or speech recognition to build great skills for Alexa. We provide resources, tools, templates and support for everyone who is interested. Developers can visit our website (developer.amazon.com/alexa-skills-kit) to learn more about Alexa skills and start building them today. Of course, you can also host the entire backend for your device on Amazon Web Services.

Does this happen automatically? I write the skill, upload it to the portal, and that's it?

Alexa skills consist of a voice user interface, or VUI (which makes it possible to "read" the customer's questions), and a back-end service in the cloud (thanks to which Alexa "knows what to answer"). Developers can use the Alexa Skills Kit to work on both parts. There are also third-party companies offering services that speed up and simplify the design, prototyping, coding, testing and monitoring of Alexa skills.
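As a rough illustration of the back-end half of a skill, the Python sketch below shows a handler in the style of an AWS Lambda function. It receives the request that the voice interface has already turned into a named intent and returns the text Alexa should speak. The intent name `ReadNewsIntent`, the spoken texts and the helper `build_response` are hypothetical. The JSON field names follow the request and response format as commonly described for custom skills, and should be checked against the current Alexa Skills Kit documentation before use.

```python
def build_response(text, end_session=True):
    """Wrap plain text in the JSON envelope a custom skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context=None):
    """Entry point: route the incoming request to a spoken answer."""
    request = event.get("request", {})
    request_type = request.get("type")

    if request_type == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        return build_response("Welcome. Ask me for the news.", end_session=False)

    if request_type == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "ReadNewsIntent":  # hypothetical intent name
            return build_response("Here are today's headlines.")

    return build_response("Sorry, I did not understand that.")
```

The voice interface side (sample phrases mapped to `ReadNewsIntent`) is defined separately from this code, which is why the handler never deals with audio or raw text.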

I am guessing that creators' interest in Alexa is high. But do you also actively seek it out yourselves? As in: you have a great product, so we talk to you about integrating it with Alexa...

Sometimes companies come to us with their ideas, but our goal is to make this process as self-service as possible. We offer tools and resources that enable companies to independently develop solutions tailored to their customers, regardless of whether it is a new custom Alexa skill or the integration of a skill with an existing product. We have a dedicated team of evangelists who are active in the developer community and support those who want to build their competences.

Alexa is particularly popular among smart home equipment manufacturers. But its potential is much larger. Are you trying to emphasize it in some way so that potential future partners can find a completely new application for Amazon's assistant?

Indeed, we see a lot of interest in our services among companies in the smart home industry, but other sectors are also increasingly eager to adopt voice innovations. That is why our team of evangelists works actively in the developer community, so that integrating Alexa with individual products is as simple as possible.

We run training sessions and encourage developers to use our tools in every way we can. We host webinars, and we even have a Twitch channel for developers interested in Alexa. We write blogs, create training materials, and organize hackathons and other local events. During a training session in Amsterdam, I built a skill from scratch in just 30 minutes.

However, Alexa in Poland has a problem. It's called the Google Assistant.

At the moment, Alexa seems to be winning the global market for mobile assistants. It has also been available in our country for some time, just like Siri or, indeed, Google Assistant.

History tells us that the first to enter a market usually wins it. In addition, Google has a great advantage over Amazon: it offers a default set of services on the extremely popular Android operating system. I am afraid that if Amazon keeps postponing the launch of a Polish-language Alexa in Poland, it may struggle to compete with Google for some time.

For now, this is reading tea leaves. One thing I already know: as soon as I part with the Echo borrowed from Amazon, I will buy my own. It is amazing how many simple household tasks it makes easier for me. And the speaker itself sounds pretty decent: I planned to pair the Echo with my home cinema, but I still have not done it. Somehow, all in all, I do not see much need for it...


