Interesting issue with DLRs

Stipe Tolj st at tolj.org
Fri Oct 20 16:06:15 CEST 2006


Hi Ben,

Ben Suffolk wrote:

> I have been running Kannel for a month or so and it's been great.
> 
> I looked at the outstanding DLRs earlier today and saw a few, and
> identified some as phones that I know people are not using any more,
> hence no delivery. That's fine, but then I noticed my number was in the
> outstanding DLRs, and after a bit of investigation I knew it was a
> message that I had received.
> 
> Looking at the debug from the smsc logs (I'm using SMPP, with PostgreSQL
> as the DLR storage BTW) I see that what has been happening is that the
> DLR is actually coming in a fraction faster than the submit_sm_resp
> with the message ID in it. (Or at least the receiver thread is ahead of
> the transmitter thread.)
> 
> This means the DLR is being ignored as it's not in the table, then it
> gets created and put in the table immediately after. So it's then
> outstanding, and of course the DLR callback is never run.

OK, interesting thing indeed... We need to discuss whether this is a logical PDU
flow "problem" on the SMPP SMSC side, or whether we (Kannel) misbehave in how
our threads process things... But (!) the receiver thread inside the smsc_smpp.c
module handles all PDUs from the SMSC. So, if the DLR (deliver_sm or data_sm)
arrives before the submit_sm_resp, then I assume this is a logical misbehaviour
of the SMSC.

> I wonder if a) anybody else has come across this, or b) you can think  
> of any good ways to make sure the DLRs are not lost. e.g. maybe we  
> store them, and then when we create them we can see the status has  
> already been updated and trigger the callback?

Hmmm... good point. I also face some connectivity issues when connecting 2
independent SMPP client systems with the same SMPP upstream account. Kannel
receives DLRs for which it has no temp data in DLR storage and hence "discards"
the DLRs without any meaningful processing.

We could put any received DLRs that we can't match in the "DLR MT" storage table
into a "DLR MO" storage table, i.e. run 2 tables. When we insert into the
"DLR MT" table at the point we receive the submit_sm_resp, we check whether
there is an existing entry in the "DLR MO" table. If there is, then we have
already received a DLR for this MT message.

This solves 2 issues:

a) The DLR MO table holds any DLRs that can't be resolved... that means external
applications can "fetch" the DLRs from the DLR MO table for further processing.

b) The race condition between the submit_sm_resp carrying the message ID and the
DLR itself can be resolved by matching the two up, so we get the usual HTTP
callback even when the SMSC sends the DLR before the submit_sm_resp.
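
To make the idea concrete, here is a rough sketch in Python rather than Kannel's
C. It is not Kannel code: the class, method and field names are all made up for
illustration, the two "tables" are plain in-memory dicts instead of the real DLR
storage, and it assumes a single final DLR per message. The lock stands in for
whatever keeps the receiver and transmitter threads from interleaving.

```python
import threading


class DlrStore:
    """Toy in-memory stand-in for the two DLR tables (hypothetical names)."""

    def __init__(self, callback):
        self._lock = threading.Lock()
        self.dlr_mt = {}  # message ID -> callback URL, added on submit_sm_resp
        self.dlr_mo = {}  # message ID -> status, for DLRs we could not match
        self.callback = callback

    def on_submit_sm_resp(self, msg_id, dlr_url):
        """Called when the SMSC acknowledges the MT message with its ID."""
        with self._lock:
            status = self.dlr_mo.pop(msg_id, None)
            if status is None:
                # Normal order: no DLR seen yet, so wait for it.
                self.dlr_mt[msg_id] = dlr_url
                return
        # The DLR arrived first: fire the callback now instead of losing it.
        self.callback(dlr_url, msg_id, status)

    def on_dlr(self, msg_id, status):
        """Called when a DLR (deliver_sm or data_sm) comes in."""
        with self._lock:
            dlr_url = self.dlr_mt.pop(msg_id, None)
            if dlr_url is None:
                # Unknown message ID: park the DLR instead of discarding it.
                self.dlr_mo[msg_id] = status
                return
        self.callback(dlr_url, msg_id, status)
```

Whichever of the two events comes second triggers the callback, and anything
left over in dlr_mo is what an external application could fetch, as in a).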

Opinions from the others on this approach?

> I suspect it's because I am connected directly to an operator as opposed
> to an aggregator that I am having this occasional (about 30 messages in
> 600 over 7 days approx) issue.
>
> I should also say that I set up and did the operator integration
> testing with 1.4.0 as 1.4.1 was not out at the time (came out a couple
> of weeks after), so my live service is currently running 1.4.0. I will
> upgrade, but first need to be sure of the effects of the upgrade, as
> obviously having been through the integration testing I need to be
> careful about using a different version that does something unexpected
> to the connection (in which case I would be in danger of losing the
> operator connection).

1.4.1 has a limited number of COMPATIBILITY BREAKERS. Please check the section
of the NEWS file for the 1.4.1 release, which lists any serious changes.

   http://www.kannel.org/download/1.4.1/NEWS-1.4.1

In any case, 1.4.1 is way BETTER and more RELIABLE than 1.4.0.

> So if you think this issue is only with 1.4.0 then no problem, but I  
> could not see anything in any of the release notes that suggest this  
> has been identified before.

I don't think the DLR handling issue is limited to 1.4.0.
It will definitely also be an issue for 1.4.1 and CVS HEAD.

Stipe

-------------------------------------------------------------------
Kölner Landstrasse 419
40589 Düsseldorf, NRW, Germany

tolj.org system architecture      Kannel Software Foundation (KSF)
http://www.tolj.org/              http://www.kannel.org/

mailto:st_{at}_tolj.org           mailto:stolj_{at}_kannel.org
-------------------------------------------------------------------


