Venture Stories

Seeing Language: How GORIL Helps Visualize Pronunciation for English Learners


GORIL is a digital product that teaches native Japanese speakers proper pronunciation and articulation of the English language through visualization and real-time feedback.

They recently launched a new web application, GORIL48, which visualizes the sounds users make when they speak out loud. An illustrated animation of a mouth and tongue moves along with the sound, showing the user the mouth movements that produce correct English pronunciation. But how did Satoshi Yoshida, founder of GORIL, come up with this idea? Read on to hear Satoshi describe how GORIL came to be, from idea to research and, ultimately, product launch.

Satoshi Yoshida, GORIL founder: Satoshi studied abroad as an exchange student at a university in Wisconsin, USA, while in college. After graduation, he joined Mitsui & Co., and in the fall of 2021 he was transferred to Moon Creative Lab as an in-house entrepreneur. In April 2023, he launched the iOS app GORIL Beta, and in August 2023, the web app GORIL48.

Is it “food” or “hood”?

The prototype of the idea came to me around 1997. When I was in college, I studied abroad as an exchange student at a university in Wisconsin, but even after six months, my English pronunciation had not improved at all and I lacked confidence.

I struggled with taking notes in class because I could not comprehend what was being said, and even when I tried my best to speak English, I was always asked, "What did you just say?" I was in a constant state of anxiety.

My interest in the English language started long before college, though. In junior high school, my life was all about baseball. I practiced for endless hours without a break, aiming for the Koshien (National High School Baseball Championship), and English was the only thing I was interested in other than baseball. I had this vague idea that if I could master English, I would eventually expand my horizons.

When an exchange student from Australia transferred to my school in my second year of high school, I was assigned to be his class guide. At first, he could barely speak Japanese, so I spent a year at school with him, carrying both Japanese-English and English-Japanese dictionaries. However, having only been taught English by Japanese teachers up to that point, I was honestly at a loss when it came to understanding English.

For example, I could not understand the difference between "food" and "hood" at all. I asked the exchange student to pronounce those words for me dozens of times, and I repeated them so much that he told me to give it a rest, but I still couldn't hear the difference – and this problem came up again later in life when I studied abroad in the U.S. Not only was it difficult to properly formulate what I wanted to say in English, but I was also having problems getting him to understand what I was saying through my pronunciation in the first place!

My scores on written tests did not reflect the difficulty I was experiencing with pronunciation – it was a completely different situation. After much trial and error to figure out what to do, the idea that would become the prototype for GORIL came to me.

"I could not understand the difference between 'food' and 'hood' at all."

Learning through imitation

Six months into my U.S. study, I was still struggling with my English pronunciation. I decided at that moment that I needed to ask a classmate to teach me. The classmate I reached out to was an English as a Second Language (ESL) major and aspired to become a teacher, so the student was eager to help me.

One day while practicing, I suddenly started looking at my classmates' mouths and imagining the shape of their mouths and the movement of their tongues and muscles. When I was finally told, "That was good," I became aware that the shape of my mouth and the movement of my tongue were important to my pronunciation! Suddenly, I was able to pronounce the words correctly and was given positive feedback.

Over time, I was able to imitate other sounds as well, and I was able to apply these sounds to more and more words. Eventually, people would tell me, "You have good pronunciation! Where did you learn to speak like that?" I gradually gained the confidence I was lacking and actually wanted to speak more English. This made me think that perhaps my method for pronunciation could be replicated to improve others' English pronunciation as well.

Still a student, not quite the teacher yet

Since joining Mitsui, I’ve worked mostly in logistics. But in 2015, I was transferred to Singapore as a port operator. At the time, a colleague of mine, who was also transferred, wanted to improve his English pronunciation. I remembered my university days and tried the same pronunciation approach, this time with me as the teacher and him as the student.

But when I told him, "Try to visualize the opening of your mouth, the position and use of your tongue, and adjust the way you produce the sound," he did not understand what I meant by "visualize." No matter how many times I tried to explain the concept, I couldn’t get it right. He told me, "If I could see the image you are talking about more clearly, I think I would be able to do it." I began to agonize over it.

The port operator I was managing at the time was forming a task force to think of how to create new businesses for the next generation, and I was fortunate enough to join. We didn't really know how to come up with new ideas, and we didn't have much experience, but as the task force members worked with Moon on the ideas, we gradually became fascinated by the novelty of Moon's approach.

At that time, I thought the idea of "visualizing pronunciation" could potentially be realized at Moon. Fortunately, I got the support of the people around me, including my boss at the time, and I made my pitch in March 2021.

Design research

With Moon’s wide variety of professional resources, we were connected with Takuya Yamagishi, a design researcher who is still working on GORIL with us. The research we conducted at the time revealed that 3D models developed to improve pronunciation already existed in the speech therapy industry.

In fact, the existing 3D model closely mimicked the movements of my classmates' mouths when I was studying in the U.S., but it was too complicated for users to understand, which really surprised me.

Of course, I was very particular about the 3D model I had in mind, and some linguists who collaborated with us agreed that a three-dimensional representation would be better to visualize pronunciation, but the users' reactions were different. I think they didn't intuitively know how to imitate it.

We have talked to nearly 200 people and provided workshops and lessons to 100 of them. I came to believe that there needed to be a more intuitive system to understand how one's mouth opens and how the tongue is positioned to produce a certain sound. As a result, I began to think about a two-dimensional interface that would allow me to roughly understand the opening of my mouth and the position of my tongue, and my ideas evolved into the current form of the program.

2D proves better than 3D

The team decided to utilize 2D to teach pronunciation and we have even gamified the experience. We developed familiar games like pinball where users must pronounce vowel sounds in order to control the game.

[Image: GORIL interface]
The pictured interface maps formant frequencies to two-dimensional coordinates. By objectively capturing the characteristics of the sound and showing them in the visuals, the mouth movements are expressed in a way that is easy to imitate.
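As a rough illustration of how formant frequencies can become two-dimensional coordinates, here is a minimal sketch in Python. It is not GORIL's actual implementation, which the article does not describe. It assumes the first two formants (F1 and F2) have already been measured in Hz, and it follows the common vowel-chart convention in which F1 tracks how open the mouth is and F2 tracks how far forward the tongue is. The function name `formants_to_xy`, the display ranges, and the sample formant values are all hypothetical choices made for this example.

```python
# Assumed display ranges in Hz (hypothetical, chosen to cover typical vowels).
F1_RANGE = (200.0, 1000.0)
F2_RANGE = (500.0, 2500.0)


def _normalize(value, low, high):
    """Scale value into [0, 1], clamping anything outside the range."""
    clamped = min(max(value, low), high)
    return (clamped - low) / (high - low)


def formants_to_xy(f1_hz, f2_hz):
    """Map (F1, F2) to a point in the unit square, vowel-chart style.

    x: 0 = tongue forward (high F2), 1 = tongue back (low F2)
    y: 0 = mouth nearly closed (low F1), 1 = mouth wide open (high F1)
    """
    x = 1.0 - _normalize(f2_hz, *F2_RANGE)
    y = _normalize(f1_hz, *F1_RANGE)
    return x, y


if __name__ == "__main__":
    # Rough, illustrative formant values only; real values vary by speaker.
    samples = {"ee (as in 'feed')": (300.0, 2300.0),
               "ah (as in 'father')": (700.0, 1100.0)}
    for label, (f1, f2) in samples.items():
        x, y = formants_to_xy(f1, f2)
        print(f"{label}: x={x:.2f}, y={y:.2f}")
```

With this convention, a vowel made with the tongue forward and the mouth nearly closed lands near one corner of the square, and a vowel made with the tongue back and the mouth open lands toward the opposite corner, so a learner can steer a point on screen by changing mouth shape.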

This simplification matters because the way people perceive different sounds differs from person to person. Even among people who say that they understand and can pronounce a sound, some can imitate it naturally and immediately, while others cannot. The same goes for listening: some people can break what they hear down into many elements, and others cannot.

When you practice English pronunciation with GORIL, you are turning pronunciation into a simple exercise of moving your mouth into a certain shape, and once you learn it, it’s hard to forget – much like riding a bicycle.

The 100+ people who have tried GORIL so far have confirmed its effectiveness, and I am now convinced that anyone can understand it and create a new muscle memory related to English pronunciation, as long as it is visualized in this way.

In my 2021 pitch, my Singaporean English, or “Singlish,” was in full swing because I had been in Singapore for a long time. Some of the Moon Committee members I was pitching to commented, "Satoshi's English is special, isn't it!" But, as I’ve learned from my own experience, English has many different pronunciations to begin with.

GORIL will start with American English, but in the future, we would like to include British English, Australian English, Singaporean English, and many others. We have already visualized a portion of the Japanese language, and we believe that there is ample potential for visualization of languages other than Japanese and English.



GORIL released its iOS app GORIL (beta version) in April 2023 and its web app GORIL48 in August. GORIL has been actively sharing its work on English pronunciation by holding events, appearing on radio programs, and starting its own podcast, "GORIL Pronunciation Club" (available on Spotify, Apple Podcasts, and Google Podcasts).

In the future, GORIL intends to develop an English-learning support tool that allows parents and children to learn pronunciation together, and to visualize not only words but also sentences. The team is also looking at applying this technology to language learning around the world. For the latest information, please visit GORIL's official website and Twitter.


Check back here frequently as we will be sharing more stories and updates about GORIL.
