SMS Channel Solutions


(Nikodem Graczewski) #1

Hello everyone!

I’m currently researching options for adding an SMS channel to OpenLMIS. As far as I understand, the requirement is only to be able to send an SMS, so there is no need for more complex things like IVR campaigns and similar features. So far, I’ve come up with the following options.

  • Kannel
  • Jasmin
  • RapidPro
  • Nexmo
  • Twilio
  • TextIt

I’ve recapped them all in the “SMS channel solutions” document on our Confluence.

I would love to get your feedback on which of the approaches make the most sense to you and whether there are any tools that I might have overlooked during my research. Also, feel free to post any questions or suggestions regarding the linked document.

Best regards,
Nikodem


(Sammie Im) #2

@ngraczewski Is there any information on logging and resend capabilities for each of these services? In particular, does Kannel offer those features?


(Josh Zamor) #3

Thanks @ngraczewski,

(For further clarity in this thread the gap project requirements for notifications are captured at: https://openlmis.atlassian.net/wiki/x/JwBlH)

Your table on SMS providers is helpful for being concise while also covering a broad sampling of technologies and services. However, we do need to realize that the comparison in this table isn’t clear because these options are so different. Some of these are service providers (Nexmo, Twilio, TextIt, etc), others are software (Kannel, Jasmin SMS, RapidPro), and at least one is duplicated in a way (RapidPro is the software anyone could install that’s more or less TextIt). We don’t have a clear apples to apples comparison.

Further our requirements are loose on the subject of SMS. If we focused on cost to the implementer in running high volumes of messages we might sort this table one way, whereas if we focused on cost to develop we’d sort this another, and if we focused on cost to implementer to deploy we’d sort yet a third way (e.g. cost to install Kannel is usually a few thousand USD). We’re at the point that work is blocked, so I’d encourage us to focus on adopting an approach that allows us to move forward optimizing for development time, bang for our buck, and future flexibility.

I’d take the software providers off this list straight off. Configuring, deploying and running software are extra costs on development time. Let’s for now focus on cloud solutions that let us get started straight away with no further delay and against which we can develop automated integration test suites.

For the service providers we still don’t have a clean comparison. TextIt is a service that can use service providers such as Twilio and Nexmo, i.e. it provides value on top of those services. If we truly were trying to optimize for time we might use SMS over SMTP, however a quick scan of these services shows that a turn-key solution isn’t quite available. That said, at a high level the REST APIs of these services look simple enough and generally have a number of similarities (most have some way to send a message to a number/URN and use some type of access token).

For the service providers here’s my preference:

  1. TextIt for its value add on top of Twilio that might give us cheap(er) ways to provide for some of the other requirements such as translatable messages, timing, etc. This also gives an implementer great ability, in theory, to optimize for SMS volume pricing if they took on hosting of RapidPro and even Kannel.
  2. Twilio for its focus on being a service provider and its focus on developers. It also has an API for WhatsApp when that becomes a requirement.

Above all though I expect that this choice in provider will need to grow to support many providers in the future as implementations optimize for different costs in different regions. Therefore the design for the Notification service needs to encapsulate how an SMS is sent, keeping those details away from all the other services. We marked OLMIS-1751 as done so it appears we’re on our way toward achieving this, however I’d stress that if we find further details which need to be encapsulated, we should prioritize them now. If we do this correctly, after the 3.6 release, I should be able to work on adding a new provider (Nexmo, WhatsApp, SNS, etc), without touching any of the other services. Are we confident we’ll achieve this?
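
To make the encapsulation goal concrete, here is a minimal sketch of one way it could look. It is not the actual Notification service code, and every name in it (SmsProvider, RecordingProvider, SmsChannel) is hypothetical. The idea is that only one small class knows which provider is in use, so swapping TextIt for Nexmo or another provider would mean writing one new class with a send method.

```python
from typing import Protocol


class SmsProvider(Protocol):
    """Hypothetical contract: deliver one text message to one phone number."""

    def send(self, to: str, text: str) -> None: ...


class RecordingProvider:
    """Stand-in provider that only records messages; useful in tests."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, text: str) -> None:
        self.sent.append((to, text))


class SmsChannel:
    """The only piece that knows a provider exists; callers never see it."""

    def __init__(self, provider: SmsProvider) -> None:
        self._provider = provider

    def notify(self, phone_number: str, text: str) -> None:
        self._provider.send(phone_number, text)


provider = RecordingProvider()
SmsChannel(provider).notify("+260000000000", "Requisition approved")
```

A real provider class would wrap the HTTP call to the chosen service. A recording stand-in like the one above is also one possible answer to the integration-testing question, since it needs no credentials.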

Best,
Josh


(Carl Leitner) #4

If you are looking to align with some of the other investments we are making through Digital Square, I would suggest that you take a look at RapidPro. We have two investments we are making here:
1) Productization of mHero which links RapidPro to health worker* (iHRIS) and health facility registries using the mCSD standard. This productization will align with the Instant OpenHIE proposal. IntraHealth is leading mHero work and Jembi the Instant OpenHIE work.
2) mHero + DHIS2 workflows for disease surveillance use cases. It will support alerting of health workers using the mACM standard.

Happy to talk through some of the technical aspects.

Cheers,
-carl

*we are adopting the broad WHO definition of health workers - not just clinical care providers but would also include people involved in the medical commodity supply chain.


(Josh Zamor) #5

Thanks @Carl_Leitner, do you happen to know how close RapidPro and TextIt attempt to keep in terms of feature parity? Or better yet do you know if anyone has a knowledge base for gaps an implementer might face if they switch from TextIt to hosting RapidPro on their own?


(Wes Brown) #6

Thanks very much for the excellent overview, @joshzamor!

Is there any concern about adopting a technical approach (the use of a service provider) which will incur ongoing costs when SMS notifications are sent? One of the reasons that SMS notifications were pushed for by our field representatives during the Zambia meetings was that the government had agreements with the cell network providers for free SMS messaging. Cost was not a consideration and therefore SMS was determined to be the best option for sending certain types of notifications. If we push for an option which ends up with a per-message cost, who will pay for it, and will it affect the usage and overall utility of the SMS notification feature?

It is for that reason that I think that we should at least discuss how much additional work it would be to go with one of the software providers. @ngraczewski Given that your proposed solution was to use Kannel, do you have any thoughts on this?


(Josh Zamor) #7

I presume you mean a service provider here? Sending SMS always has a per-message cost as far as I’ve ever seen.

Hmm, let me attempt to remake my case for TextIt/RapidPro @ibewes.

The advantage of using TextIt with Twilio today is that it’s a turn-key cloud service that we can develop with today, and that gives us the opportunity to switch to TextIt with Kannel tomorrow with no change to OpenLMIS. If the cost of TextIt is a concern, then the open source RapidPro can be substituted, with or without Kannel. That one API, the TextIt / RapidPro v2 API, covers a wide area for implementers of OpenLMIS to mix and match different combinations of cloud solutions or self-hosted options should they desire.

I really don’t want us to spend any more time on Kannel (or Jasmin). While I presume the API for OpenLMIS to use Kannel isn’t likely much different to develop against than the others, I do know first hand that properly configuring a Kannel server with a telco provider is far from trivial. Given that we have the option to use TextIt/RapidPro to abstract away using Kannel or another turn-key provider, integrating with Kannel directly makes zero sense.

To keep Kannel as an option for our implementers, that narrows the choice down to the TextIt/RapidPro API, and I don’t see any reason to host RapidPro on our own for the gap project. From my experience with TextIt’s API in the past, I think the team will find it slightly more difficult to use to send simple ad-hoc messages (what our bare requirements are) than using Twilio directly, however it’s still pretty straightforward.

Proposed next steps:

  • Acquire / repurpose a TextIt account for OpenLMIS. VR has one already, though I believe we’ll want a dedicated OpenLMIS community account. I’ll get this on my todo list to be ready for sprint 118.
  • I’ll create a simple component architecture diagram so we can all visualize the pieces and get to consensus.
  • @ngraczewski and team to create a basic work plan for familiarizing with TextIt/RapidPro and building the integration.
  • @ngraczewski et al on the team: I’d propose that we have a quick review call (could use the Tech Comm call on the 22nd) to go over ensuring the Notification service isn’t leaking details and to check off that we’ve thought about how we’ll do integration testing, managing credentials, etc.

Thoughts?

Best,
Josh


(Chongsun Ahn) #8

@joshzamor When you ask about encapsulation of the workings of SMS, I assume you are talking about the interface the notification service advertises to other services? I believe currently all services, when talking to the notification service, only specify:

  1. Whom to notify
  2. Notification “channels” (like email, SMS, etc.), and
  3. Each channel’s message content

How the notification is actually sent is determined within the notification service itself. Hopefully this is the encapsulation you are expecting.
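
As an illustration of that interface, here is a minimal sketch of a request body carrying those three pieces of information. The field names (userId, messages, subject, body) are assumptions made for the example and should be checked against the published API definition.

```python
import json


def build_notification(user_id, messages):
    """Build a request body: whom to notify plus the content for each channel."""
    if not messages:
        raise ValueError("at least one channel message is required")
    return {"userId": user_id, "messages": messages}


body = build_notification(
    "user-123",  # hypothetical user identifier
    {
        # The keys are the channels; each value is that channel's content.
        "email": {"subject": "Stock alert", "body": "Stock is low at facility X."},
        "sms": {"body": "Stock is low at facility X."},
    },
)
print(json.dumps(body, indent=2))
```

Nothing in the body names a provider such as TextIt or Twilio, which is the point: the calling service only chooses channels and content.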

Shalom,
Chongsun


(Josh Zamor) #9

@Chongsun_Ahn: That’s great. What about this field marked important?

http://build.openlmis.org/job/OpenLMIS-notification-pipeline/job/master/lastSuccessfulBuild/artifact/build/resources/main/api-definition.html#api_notifications_post


(Chongsun Ahn) #10

@joshzamor Ah yes, forgot about that. I believe that is a flag which isn’t formally defined at the moment. Currently, it’s used to indicate the notification should be sent immediately, and not consolidated (with other notifications) or delayed.

Shalom,
Chongsun


(Josh Zamor) #11

Ah, okay. That seems innocuous. Thanks!


(Nikodem Graczewski) #12

Hello everyone,

First of all, I would like to thank you all for your comments and apologize for the slow response. I needed to do further research to be able to respond to all the questions and suggestions properly.

@sam.im

I have dug deeper into the service providers (I have omitted the software solutions, as I have a feeling that we’re leaning heavily away from them) and I can confirm that all of them (Nexmo, Twilio and RapidPro) have some kind of message status logging accessible through their websites. Unfortunately, I was unable to find any information about re-sending messages when they are not sent successfully. However, all the services provide a way of reporting whether a specific message was sent or not.

@Carl_Leitner

Thanks for pointing to RapidPro. I have included it in the comparison.

@ibewes

I picked Kannel as it seemed the most flexible in terms of supported protocols and because it didn’t add any per-message cost on top of the SMS provider cost. The amount of work that would be needed on the notification side is comparable with any of the other service/software providers, as all of them provide REST APIs for communication. However, we would also have to work on a way to include a Kannel instance in OpenLMIS and make it easily configurable (which might not be the case, as @joshzamor mentioned). Personally, I was thinking about creating a Docker image for it, as the one I pointed to in the comparison seems to be outdated.

@Chongsun_Ahn

I believe that is not the purpose of this flag. As far as I understand, the intent of the “important” flag is to ignore user preferences on whether they want to be notified. Currently, it is used for sending auth-related notifications like password reset emails.

@joshzamor

I totally agree that we’re not comparing apples to apples here. The reason for that is it was challenging for me to find a common ground for all of the solutions to be summarized on a single page. I would also mention that TextIt and RapidPro were included as separate solutions as they differ in how the cost is laid out (upfront versus per message).

Thanks for pointing out that Kannel would be a more costly solution. I wasn’t really expecting the upfront cost to be high enough to make the ongoing per-message cost the cheaper approach. Could you explain why the cost of setting up Kannel is so high?

I’ve looked into Nexmo and Twilio and both have support for WhatsApp, but those features are in early access/beta stages and are marked as Limited Availability. Also, they both seem to be pretty similar to code against and both of them provide Java SDKs.

I can see that we’re leaning towards TextIt as it provides the most flexibility. It has support for Nexmo, Twilio and Kannel and it can even be substituted by RapidPro (as far as I know, TextIt is just a provider of RapidPro, so the REST API should be the same). So, here’s my concern about it. I know that it provides a means of message customization, but that would only work for the channels handled by TextIt, right? I couldn’t find any information about TextIt supporting email as a channel, so we would have to build that functionality into OpenLMIS itself anyway. Considering this, would the per-message cost on top of the cost of Nexmo/Twilio/Kannel still be justifiable?

I can create a ticket for further researching RapidPro/TextIt so someone from team Mind the Gap can look into it. What information/features of TextIt are we interested in? What would be the product of this ticket?

Unfortunately, I will be unable to attend the Tech Committee call on January 22nd and thus I would suggest moving the discussion to the next Tech Committee call or scheduling a separate meeting to talk about it.

Best regards,
Nikodem Graczewski


(Josh Zamor) #13

Thanks @ngraczewski for the comprehensive followup.

We should take this as a work item. We need to define exactly what this parameter means (i.e. semantics). Given the different but equally plausible definitions I’m wondering if we should not only define what this flag means today, but also consider what we want it to mean in the future. “important” is so wide that it could cover a lot. Perhaps a different name with a narrower meaning is needed? I’m going to add this to the Technical Committee’s agenda for an upcoming call.

I won’t pretend to be a Kannel expert, so I can’t say exactly. What I do know is that in a previous project we collected a couple of vendor costs for setting up a Kannel server and we also had a partner organization attempt to set it up and fail after months of work. The complexity I observed comes from setting up the privileged network connection between Kannel and the telco’s SMSC. With the telcos in the countries where we tend to work, this process appears to be quite time-intensive to get right.

This is a good point. I wouldn’t expect, given the priority of the 3.6 requirements, that we’d focus on this aspect of using TextIt for message content management first. To answer your most immediate question: it is possible to send an email, both from a flow and also apparently with a Zapier integration. There are a number of aspects we’d need to address if we actually wanted to consolidate more functionality into TextIt/RapidPro.

I created a simple component architecture diagram so that we can visualize some of the responsibilities: https://openlmis.atlassian.net/wiki/x/hADUHQ

Good question. I’d like to see the ticket focus away from research gathering and more toward building a simple PoC. As referenced in the diagram above, I’d like to see us focusing on building a quick PoC using either the /{B}/flow_starts or the /{B}/broadcasts endpoint. We should focus on gathering some of the limitations of these endpoints (e.g. URN format, sending to only 100 addresses at a time, etc) so that we know how to build the real thing. I’d be more than happy to review the ticket once it’s started.
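
As a rough illustration of one such limitation, here is a minimal sketch of how a caller might split recipients into batches of 100 before posting to the broadcasts endpoint. It only builds the requests and never sends them. The base URL, the ".json" suffix, the "urns" and "text" field names, the token header format and the sample numbers are all assumptions to verify against the TextIt/RapidPro API documentation during the PoC.

```python
import json
import urllib.request

# Limit mentioned in this thread; confirm the real value in the API docs.
MAX_URNS_PER_REQUEST = 100


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_broadcast_requests(base_url, token, urns, text):
    """Build (but do not send) one POST request per batch of URNs."""
    requests = []
    for batch in chunk(urns, MAX_URNS_PER_REQUEST):
        payload = json.dumps({"urns": batch, "text": text}).encode("utf-8")
        requests.append(urllib.request.Request(
            url=f"{base_url}/broadcasts.json",
            data=payload,
            headers={
                "Authorization": f"Token {token}",
                "Content-Type": "application/json",
            },
            method="POST",
        ))
    return requests


# Placeholder data: 250 made-up URNs in an assumed "tel:" format.
urns = [f"tel:+2600000{n:05d}" for n in range(250)]
reqs = build_broadcast_requests(
    "https://example.org/api/v2", "PLACEHOLDER", urns, "Test message")
print(len(reqs))  # 3 batches: 100 + 100 + 50
```

Passing each request to urllib.request.urlopen would actually send it. Keeping the building and sending steps separate makes the batching logic easy to test without credentials.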

Noted. When does the decision need to be made? IOW when will people be blocked from starting this work? I’d encourage us to get toward building sooner rather than later - let’s discuss as soon as we’re ready.

Still on my todo list is to create a TextIt account for OpenLMIS. I’ll update here once complete.


(Nikodem Graczewski) #14

Hi @joshzamor,

We should take this as a work item. We need to define exactly what this parameter means (i.e. semantics). Given the different but equally plausible definitions I’m wondering if we should not only define what this flag means today, but also consider what we want it to mean in the future. “important” is so wide that it could cover a lot. Perhaps a different name with a narrower meaning is needed? I’m going to add this to the Technical Committee’s agenda for an upcoming call.

I’ve double-checked the code and the flag works as I’ve described (overriding the “allowNotify” flag). We haven’t yet implemented notification consolidation.
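
The behaviour described above can be summarised in a few lines. This is an illustrative sketch only, not the code from the Notification service, and the function and parameter names are hypothetical.

```python
def should_send(allow_notify: bool, important: bool) -> bool:
    """Decide whether a notification goes out to a user.

    Assumed rule: an important notification is sent even when the user
    has opted out of notifications (allow_notify is False).
    """
    return important or allow_notify


# A password reset (important) still reaches a user who opted out.
print(should_send(allow_notify=False, important=True))
```

Under the other interpretation raised in this thread (send immediately, do not consolidate or delay), the flag would affect timing rather than this decision, which is why the meaning needs to be pinned down.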

I think that we should definitely discuss the meaning of the flag in the future or rename it to be more meaningful.

Thanks for adding it to the Technical Committee’s agenda, it will be really helpful to discuss it with a broader audience.

I won’t pretend to be a Kannel expert, so I can’t say exactly. What I do know is that in a previous project we collected a couple of vendor costs for setting up a Kannel server and we also had a partner organization attempt to set it up and fail after months of work. The complexity I observed comes from setting up the privileged network connection between Kannel and the telco’s SMSC. With the telcos in the countries where we tend to work, this process appears to be quite time-intensive to get right.

Thanks for the clarification!

I created a simple component architecture diagram so that we can visualize some of the responsibilities: https://openlmis.atlassian.net/wiki/x/hADUHQ

Thanks for creating this page. It sums up the responsibilities very well.

Good question. I’d like to see the ticket focus away from research gathering and more toward building a simple PoC. As referenced in the diagram above, I’d like to see us focusing on building a quick PoC using either the /{B}/flow_starts or the /{B}/broadcasts endpoint. We should focus on gathering some of the limitations of these endpoints (e.g. URN format, sending to only 100 addresses at a time, etc) so that we know how to build the real thing. I’d be more than happy to review the ticket once it’s started.

I’ve created a ticket for creating the PoC, could you take a look at it? https://openlmis.atlassian.net/browse/OLMIS-5951

Noted. When does the decision need to be made? IOW when will people be blocked from starting this work? I’d encourage us to get toward building sooner rather than later - let’s discuss as soon as we’re ready.

As per our Technical Committee call discussion, it would be best for the discussion to take place this or the next week. This shouldn’t be a blocker until Sprint 119 or even 120.

Best regards,
Nikodem Graczewski


(Josh Zamor) #15

When we say “build an integration” do we mean Java code? Something even lighter weight, such as a document that has sample JSON to paste, curl commands going to endpoints, etc, would also work well enough to get at the second bullet point’s intention of finding the pertinent facts of using either the flow_starts or broadcasts resource.

Thanks @ngraczewski


(Nikodem Graczewski) #16

@joshzamor

Yes, I meant Java code. Do we want to go with something lighter instead?

Best regards,
Nikodem Graczewski


(Josh Zamor) #17

If we’re building a proof of concept, something to throw away, that’s just to help inform us on how the API works, then I’d encourage building something lighter than a Java implementation. A walk-through of curl commands with sample data would be acceptable here.
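
For example, a throwaway script along these lines could generate the curl commands for such a walk-through. It is a minimal sketch: the base URL, token, flow UUID, phone number and the JSON field names ("urns", "text", "flow") are placeholders or assumptions to confirm against the TextIt/RapidPro API documentation.

```python
import json
import shlex


def curl_command(base_url, token, resource, payload):
    """Return a copy-and-paste curl command that POSTs `payload` as JSON."""
    parts = [
        "curl", "-X", "POST", f"{base_url}/{resource}.json",
        "-H", f"Authorization: Token {token}",
        "-H", "Content-Type: application/json",
        "-d", json.dumps(payload),
    ]
    # shlex.quote keeps the JSON and headers intact when pasted into a shell.
    return " ".join(shlex.quote(part) for part in parts)


BASE = "https://example.org/api/v2"  # placeholder base URL
print(curl_command(BASE, "PLACEHOLDER", "broadcasts",
                   {"urns": ["tel:+260000000000"], "text": "Test message"}))
print(curl_command(BASE, "PLACEHOLDER", "flow_starts",
                   {"flow": "FLOW-UUID-PLACEHOLDER", "urns": ["tel:+260000000000"]}))
```

The printed commands can be pasted into the ticket next to the responses they produce, which keeps the PoC disposable while still recording how each resource behaves.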


(Nikodem Graczewski) #18

Hi @joshzamor,

That makes sense to me. I’ve updated the ticket with that information.

Best regards,
Nikodem Graczewski