They provide you with a client JavaScript library, and you need a
server account (with an app ID and secret key) to connect your application to
the respoke server and allow clients to communicate with each other.
This is an intuitive approach. As a service developer, you
pay for the server usage, and you do so depending on how many concurrent
clients you want.
I got a testing account, and started trying out the JS
library. The process of building a new application was very straightforward,
and the docs guided me towards building a simple app to make audio calls, and
then video calls as well.
Soon I started thinking: respoke is from Digium, right,
and Digium develops Asterisk. How is it possible that respoke and Asterisk
cannot interconnect? What I’d like to do is place a call from the web, and in some
circumstances route it to a SIP client, a PSTN line, or a mobile phone.
It turned out that my expectation was quite justified: 36
hours before the beginning of the Astricon Hackathon, Digium announced chan_respoke, a new module for Asterisk (13) that allows Asterisk to connect as
a respoke client and communicate with JS clients.
So that was the good news. The bad news was that we didn't
have any time to prepare before the Hackathon, so we had about 8 hours to get
up to speed and build something… sexy!
The other service that the Astricon Hackathon encouraged
us to use was Clarify, which provides APIs to upload audio recordings
and can detect specific “tag words” in them.
Among the ideas discussed while the teams were forming, the most
complete and compelling, and one that probably did require all 5 people working
together, was GrannyCall. You can see some details here, with the list of team
members.
GrannyCall was conceived as a system for kids to call their
granny (or daddy, mummy, etc.) from a simple web page, and get a score
depending on how appropriate and rich their vocabulary was.
For example, we wanted to give some positive score for words
like “love” or “cookie”, and perhaps a negative score for… well, you can guess
some words that would score badly for a kid talking to his/her granny.
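
The scoring idea can be sketched in a few lines of JavaScript. This is only an illustration, not the code we wrote at the Hackathon: the word list, the weights and the function name `scoreDetectedWords` are all made up, and it assumes the detected tag words are already available as a plain array of strings.

```javascript
// Hypothetical weights: positive for "nice" words, negative for words
// a kid should not say to granny. The real list is not in this post.
const WORD_SCORES = new Map([
  ["love", 2],
  ["cookie", 1],
  ["stupid", -3], // placeholder for a badly scoring word
]);

// Sum the weight of every detected word; unknown words add nothing.
// `words` is assumed to be an array of tag words found in the recording.
function scoreDetectedWords(words, scores = WORD_SCORES) {
  return words.reduce(
    (total, word) => total + (scores.get(word.toLowerCase()) ?? 0),
    0
  );
}

console.log(scoreDetectedWords(["love", "cookie", "granny", "Love"])); // 5
```

Counting every occurrence (rather than each word once) rewards a kid who says “love” twice; that is a design choice, and a per-word cap would be just as reasonable.
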
The project was quite ambitious, because the call
would originate from a web page built with Voxbone’s WebRTC library, reach
Asterisk over SIP, and then ring the granny on a web page built with the
respoke client.
We used an Ubuntu VM from DigitalOcean to host Asterisk and
the web servers (nginx) for the two web pages, and an external web server to
interconnect to the Clarify APIs for uploading the recordings (with the desired
tags).
Asterisk needed to be version 13, with chan_respoke built
and configured. The DigitalOcean box had a public IP, so there was no need
for any special network configuration.
The “kid to Voxbone to Asterisk” part allowed for some
preparation, and it worked smoothly as soon as we had built the web server and
uploaded the client page.
While Asterisk was being built and configured, we built the
granny web page with the respoke library. Again in this case it was quite easy and
quick to have a call between two respoke clients, peer to peer, just to test
the client application on the browsers.
The tricky part was originating the call from Asterisk to
the granny web page, using chan_respoke.
For the sake of testing the connection and media
establishment, we made some calls from the respoke client to Asterisk, hitting
an announcement and an echo test. That worked almost immediately, and it was
great!
Now it was the key moment: can we do the full flow (kid – Voxbone – Asterisk –
respoke – granny)? In terms of establishing the call, i.e. signalling, that
worked too, after just a few tweaks to chan_respoke’s configuration. But what
about audio?
It turned out that there were some problems in the ICE
negotiation between the respoke client and Asterisk: we had audio only in one
direction. We were using Chrome at that moment, and moving to Firefox didn’t
help, so we suspected there could be a bug in the libraries.
Considering how new the libraries were, this looked
completely understandable, and the respoke guys spent a lot of time helping us
investigate the problem and trying to find a proper solution before the
submission deadline (this resulted in a server-side patch being applied in the following hours).
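
For a symptom like this, one quick check from the browser side is to compare how many audio bytes are going out and coming in. The sketch below is not what we did during the Hackathon: it assumes you can get hold of the underlying `RTCPeerConnection` (I'm not claiming the respoke library exposes it), and `summarizeAudioFlow` is a hypothetical helper working on the entries of the standard `getStats()` report.

```javascript
// Summarize audio traffic from an iterable of WebRTC stats entries
// (the values of an RTCStatsReport, or plain objects of the same shape).
function summarizeAudioFlow(statsEntries) {
  let bytesSent = 0;
  let bytesReceived = 0;
  for (const entry of statsEntries) {
    if (entry.kind !== "audio") continue;
    if (entry.type === "outbound-rtp") bytesSent += entry.bytesSent ?? 0;
    if (entry.type === "inbound-rtp") bytesReceived += entry.bytesReceived ?? 0;
  }
  // One-way audio: media flows in exactly one of the two directions.
  const oneWay = (bytesSent > 0) !== (bytesReceived > 0);
  return { bytesSent, bytesReceived, oneWay };
}

// In a browser, with `pc` being an RTCPeerConnection (hypothetical variable):
//   const report = await pc.getStats();
//   console.log(summarizeAudioFlow(report.values()));
```

If `oneWay` stays true a few seconds into the call, the problem is in media establishment rather than in signalling, which is what we were seeing.
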
Honestly, I was happy that the Hackathon was scheduled for
a relatively short eight hours: had it been any longer, we probably
wouldn’t have had dinner or a proper sleep (and the 8 (or 9) hours of jet lag were
not particularly helpful!).
At submission time, we didn't have two-way audio. Also, the
debugging ate precious time we needed to prepare the presentation to the judges, and this
could be the reason why we weren't awarded any prize. Honestly, given the
intensity of the effort and the complexity of the project, I was hoping for at
least an honorable mention, but I hope we can gather the same team again on a
different occasion and bring different results!
Jokes aside, it’s been an extremely useful experience. No
documentation or remote communication can replace live interaction and
working on a proof of concept – in particular if you have a crazy deadline and
a body full of caffeine and sugar (and only a few hours' sleep in the last 36
hours).
Of course we took some shortcuts, like removing any firewall
from the host, using a shared Linux user authenticated via password rather than SSH
keys, editing files in place, etc. We did those things knowing they weren't best
practices but aiming to complete a proof of concept as quickly as possible.
The takeaway from all this is very simple: if you’re
developing a new technology, or a new solution oriented to developers, do whatever you can to involve the developers in a productive, challenging way. Hackathons represent a great solution, even if confined within a company, department or team. The excitement and the feedback (and debugging) you'll help to generate will have a tremendous value.