Kannel not load balancing after restart and messages get stuck in queue

Alvaro Cornejo cornejo.alvaro at gmail.com
Wed Jun 10 06:56:33 CEST 2009


Hi Nikos

I'm still around... I've been busy these last few days... I wish I
were on vacation ;-)

Regarding this patch: I haven't had time to test it, but it seems to
solve the issue by making kannel hold the queue assignment until all
smsc's are online. However, I agree with Alan that this solution is
not suitable, since there are many scenarios where an smsc will not
come online, or will be delayed, at kannel startup, and this would
delay all message traffic.

I'd favor a solution where Kannel recalculates its queue (load balance)
for a given destination whenever an smsc for that route/destination
becomes available or unavailable, and not only at startup.
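
A minimal sketch of that idea, in Python rather than Kannel's C and not
based on Kannel's source: the class name Rebalancer, its methods and the
one-route-per-smsc simplification are all assumptions made up for
illustration. Every time an smsc goes up or down, the messages still
waiting for its route are pulled back and dealt out again to whichever
smscs are online at that moment.

```python
from collections import deque


class Rebalancer:
    """Toy model: queued messages are re-split on every SMSC status change."""

    def __init__(self, routes):
        # routes: route id -> list of smsc ids allowed to serve it.
        # Simplification (assumption): each smsc serves exactly one route.
        self.routes = routes
        self.online = set()
        self.queues = {s: deque() for ids in routes.values() for s in ids}
        self.unassigned = {r: deque() for r in routes}

    def _rebalance(self, route):
        # Pull back everything still waiting for this route...
        pending = list(self.unassigned[route])
        self.unassigned[route].clear()
        for smsc in self.routes[route]:
            pending.extend(self.queues[smsc])
            self.queues[smsc].clear()
        # ...and deal it out round-robin to the smscs that are online now.
        targets = [s for s in self.routes[route] if s in self.online]
        if not targets:
            self.unassigned[route].extend(pending)
            return
        for i, msg in enumerate(pending):
            self.queues[targets[i % len(targets)]].append(msg)

    def enqueue(self, route, msg):
        self.unassigned[route].append(msg)
        self._rebalance(route)

    def set_status(self, smsc, is_online):
        if is_online:
            self.online.add(smsc)
        else:
            self.online.discard(smsc)
        for route, ids in self.routes.items():
            if smsc in ids:
                self._rebalance(route)
```

In this toy model, messages queued while every smsc is offline all land
on the first smsc to come up, and are then split again as each further
smsc comes online. Message order is not preserved across a re-split,
which a real implementation would probably need to handle.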

Another option might be to give kannel an intermediate queue whose
messages are not assigned to an smsc until that specific smsc's own
queue holds less than 5 min of traffic. This way, in case of a restart,
the 1st smsc to become available will get a moderate queue and the
rest of the smsc's will load-balance the remaining messages once they
are online.
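
Roughly what I have in mind, as a small Python sketch. This is not
Kannel code: HoldingQueue, its methods and the per-smsc rate in
messages per minute are hypothetical names and numbers chosen only to
show the idea. Messages stay in a shared holding queue and are handed
to an smsc only while that smsc's backlog is below the 5 minute limit.

```python
from collections import deque

MAX_BACKLOG_MINUTES = 5  # the "5 min" limit from the paragraph above


class HoldingQueue:
    """Toy model: messages wait unassigned until an SMSC has room for them."""

    def __init__(self, rates):
        # rates: smsc id -> assumed throughput in messages per minute.
        self.rates = rates
        self.online = set()
        self.holding = deque()
        self.assigned = {s: deque() for s in rates}

    def room(self, smsc):
        # How many more messages fit before the backlog exceeds the limit.
        limit = self.rates[smsc] * MAX_BACKLOG_MINUTES
        return max(0, limit - len(self.assigned[smsc]))

    def hold(self, msg):
        self.holding.append(msg)

    def top_up(self):
        # Hand out held messages one at a time, cycling over online smscs,
        # until the holding queue is empty or every smsc is at its limit.
        moved = True
        while self.holding and moved:
            moved = False
            for smsc in sorted(self.online):
                if self.holding and self.room(smsc) > 0:
                    self.assigned[smsc].append(self.holding.popleft())
                    moved = True
```

Calling top_up() whenever an smsc comes online or finishes sending a
message would keep each smsc's queue short, so an smsc that comes up
late still finds work left in the holding queue.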

Would something like that be possible? I fear that an approach
like this one will require a major rework of kannel's queue/load
management, and I doubt the kannel team will go for it.

Also, I stand by my opinion that kannel shouldn't panic if it can't
activate/enable an smsc on startup, since there can be other smscs
that can process messages for other routes/destinations. This issue is
more important, since the only workaround is to edit kannel.conf,
comment out the offending smsc and restart kannel... and once the smsc
is fixed, re-edit kannel.conf and restart kannel again since, so far,
there is no way to make kannel reread the config file.

Once I test the patch I'll let you know. Probably this weekend.

Regards and THANKS for your time & Support

Alvaro

|-----------------------------------------------------------------------------------------------------------------|
Send and receive data and text messages (SMS) to and from any
cell phone and Nextel
in Peru, Mexico and more than 180 countries. Use 2-way applications via
SMS and GPRS online
              Visit us at www.perusms.NET www.smsglobal.com.mx and
www.pravcom.com



On Tue, Jun 9, 2009 at 10:14 PM, Alan McNatty<alan at catalyst.net.nz> wrote:
> Hi Nikos,
>
> OK, yeah I'm not about failure at start-up time. If an SMSC is off-line
> (SMPP timeout/retry during bind) kannel will start (simply config a
> bogus IP address for SMPP to test this out). As an example, let's say we
> have 2 connections for load balancing. Now consider 1 smsc being taken
> offline (no tcp connection) while it is upgraded: we can still send through
> the other (this is also useful for DR - disaster recovery, etc).
>
> So from my example above only 1 connection would be active the other in
> bind retry loop (wait X seconds, etc) - no tcp connection is established
> to smsc. If we re-started kannel it wouldn't be a good idea to wait for
> both links to become active before pushing messages out - this would
> mean waiting for the upgraded smsc to come back online, no?
>
> Again sorry if I'm talking at cross purposes here.
>
> Cheers,
> Alan
>
> Nikos Balkanas wrote:
>> Hi Alan,
>>
>> If even 1 SMSc in configuration file fails to start, bearerbox will
>> panic. There are several states to any SMSc, like connecting, going
>> down, etc. Bearerbox expects all SMScs to be available on startup. The
>> operative word is "available". If a configured modem is not connected,
>> that's a failure and you would have to take it off configuration to
>> startup.
>>
>> After it starts, SMScs can go up or down. Isolated mode still needs
>> SMScs for outgoing messages. The only mode that doesn't is Suspended.
>> But I don't think that even this will start up without all SMScs available.
>>
>> Available is different than Active. Available means anything but Failed
>> (Dead,etc.). An available SMSc could be down, or coming up or connecting.
>>
>> In this patch, router will wait to start until all SMScs become active.
>>
>> What is DR?
>>
>> BR,
>> Nikos
>> ----- Original Message ----- From: "Alan McNatty" <alan at catalyst.net.nz>
>> To: "Nikos Balkanas" <nbalkanas at gmail.com>
>> Cc: "Alvaro Cornejo" <cornejo.alvaro at gmail.com>; <devel at kannel.org>
>> Sent: Wednesday, June 10, 2009 1:22 AM
>> Subject: Re: Kannel not load balancing after restart and messages get
>> stuck inqueue
>>
>>
>>> Hi Nikos,
>>>
>>> Sorry I haven't reviewed this whole thread in detail but was caught by
>>> .. "bearerbox will not start unless all smscs are available." .. this
>>> worried me a bit.
>>>
>>> I'm concerned that in a DR scenario if 1 connection is down (and not
>>> coming up just yet) then I would like to be able to (re)start no problem
>>> (the other connections would start with this one remaining down ..
>>> trying to bind, etc). Is there not an alternative option in terms of
>>> (from init.d/startup perspective) start isolated, allowing some time
>>> (seconds) for the available connections to bind then setting the
>>> connections to active. Or have I missed the point here.
>>>
>>> Apologies if I'm muddying the water. I will try and follow more closely.
>>>
>>> Cheers,
>>> Alan
>>>
>>>
>>> Nikos Balkanas wrote:
>>>> Dear Alvaro,
>>>>
>>>> I just managed to finish with my responsibilities, not on April 10, as I
>>>> was hoping. Meanwhile, I haven't heard from you in a while, and I hope
>>>> you are enjoying your vacations.
>>>>
>>>> Here is a patch to gw/bearerbox.c and gw/bb_smscconn.c, that doesn't
>>>> start the sms_router, until all smscs (except the FAKE) are ACTIVE. If I
>>>> am not mistaken, bearerbox will not start unless all smscs are
>>>> available. So this is in line with this philosophy.
>>>>
>>>> I have tested it to the extent that it doesn't break anything; please
>>>> let me know if it solves your problem.
>>>>
>>>> BR,
>>>> Nikos
>>>> ----- Original Message ----- From: "Alvaro Cornejo"
>>>> <cornejo.alvaro at gmail.com>
>>>> To: "Nikos Balkanas" <nbalkanas at gmail.com>
>>>> Sent: Tuesday, March 31, 2009 4:56 PM
>>>> Subject: Re: Kannel not load balancing after restart and messages get
>>>> stuck inqueue
>>>>
>>>>
>>>> Hi Nikos
>>>>
>>>> No problem. I understand we are on "best effort support XD" and this
>>>> is not a critical issue.
>>>>
>>>> Thanks
>>>>
>>>> Alvaro
>>>>
>>>>
>>>> 2009/3/31 Nikos Balkanas <nbalkanas at gmail.com>:
>>>>> Hi Alvaro,
>>>>>
>>>>> Thanx for reminding me. Unfortunately I am still working very hard on
>>>>> my project. I will be finished by the 10th of April. Sorry about that,
>>>>> and I hope I am not causing you trouble. I put a reminder on my PC not
>>>>> to forget.
>>>>>
>>>>> BR,
>>>>> Nikos
>>>>>
>>>>> ----- Original Message ----- From: "Alvaro Cornejo"
>>>>> <cornejo.alvaro at gmail.com>
>>>>> To: "Nikos Balkanas" <nbalkanas at gmail.com>; "kannel users"
>>>>> <users at kannel.org>
>>>>> Sent: Tuesday, March 31, 2009 10:17 AM
>>>>> Subject: Re: Kannel not load balancing after restart and messages get
>>>>> stuck
>>>>> inqueue
>>>>>
>>>>>
>>>>> Hi Nikos
>>>>>
>>>>> Have you had a chance to review this?
>>>>>
>>>>> Regards
>>>>>
>>>>> Alvaro
>>>>>
>>>>> 2009/3/10 Nikos Balkanas <nbalkanas at gmail.com>:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am working on something very important right now. Could you give
>>>>>> me a ping in 2 weeks or so, if no one has addressed it?
>>>>>>
>>>>>> Thanx,
>>>>>> Nikos
>>>>>> ----- Original Message ----- From: "Alvaro Cornejo"
>>>>>> <cornejo.alvaro at gmail.com>
>>>>>> To: "Nikos Balkanas" <nbalkanas at gmail.com>
>>>>>> Cc: "seikath" <seikath at gmail.com>; "kannel users" <users at kannel.org>
>>>>>> Sent: Wednesday, March 11, 2009 5:29 AM
>>>>>> Subject: Re: Kannel not load balancing after restart and messages get
>>>>>> stuck
>>>>>> inqueue
>>>>>>
>>>>>>
>>>>>> Hi Nikos
>>>>>>
>>>>>> for 1... still looking at data/logs... nothing so far.
>>>>>>
>>>>>> for 2... You are right:
>>>>>>
>>>>>> SMSC connections:
>>>>>> id1    AT2[id1] (online 323s, rcvd 48, sent 48, failed 0, queued 265 msgs)
>>>>>> id2    AT2[id2] (online 314s, rcvd 0, sent 0, failed 0, queued 0 msgs)
>>>>>> id3    AT2[id3] (online 314s, rcvd 0, sent 0, failed 0, queued 0 msgs)
>>>>>> id4    AT2[id4] (online 314s, rcvd 0, sent 0, failed 0, queued 0 msgs)
>>>>>> id5    AT2[id5] (online 314s, rcvd 0, sent 0, failed 0, queued 0 msgs)
>>>>>>
>>>>>> All messages in the queue for smsc-id "id_op1" were assigned to the
>>>>>> 1st available smsc that has allowed-smsc=id_op1.
>>>>>>
>>>>>> I'm not a developer, but I think this behavior should be modified so
>>>>>> that kannel can maximize/optimize its available resources: some sort
>>>>>> of "recalculation of the better route" for messages every x time.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Alvaro
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 10, 2009 at 8:57 PM, Nikos Balkanas <nbalkanas at gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> These are some heavy questions dude.
>>>>>>>
>>>>>>> (1) SMS shouldn't get stuck like that so that they require a restart.
>>>>>>> Anything in the logs?
>>>>>>>
>>>>>>> (2) I could speculate why this happens, but without looking at the
>>>>>>> source code it would be guesswork. My guess would be that it involves
>>>>>>> the timing needed at startup to designate SMScs as active. Probably
>>>>>>> this is done by way of configuration order (i.e. id1, id2, etc.). The
>>>>>>> first active one gets the queue. Bearerbox router checks in every 30".
>>>>>>>
>>>>>>> BR,
>>>>>>> Nikos
>>>>>>>
>>>>>>> ----- Original Message ----- From: "Alvaro Cornejo"
>>>>>>> <cornejo.alvaro at gmail.com>
>>>>>>> To: "seikath" <seikath at gmail.com>
>>>>>>> Cc: "kannel users" <users at kannel.org>
>>>>>>> Sent: Wednesday, March 11, 2009 3:41 AM
>>>>>>> Subject: Re: Kannel not load balancing after restart and messages get
>>>>>>> stuck
>>>>>>> inqueue
>>>>>>>
>>>>>>>
>>>>>>> No, I'm using mysql for dlr storage since otherwise dlr are lost on
>>>>>>> kannel restart
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Mar 10, 2009 at 8:15 PM, seikath <seikath at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am not aware of the store_tools.
>>>>>>>> Anyway, I assume you use default kannel store file
>>>>>>>> instead of db dlr storage, correct ?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Alvaro Cornejo wrote:
>>>>>>>>>
>>>>>>>>> Hi List
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I've found the following 2 issues:
>>>>>>>>>
>>>>>>>>> 1) I have several modems for the same operators: 10 for op1 and 5
>>>>>>>>> for op2. For some reason, some messages get stuck in the kannel
>>>>>>>>> queue and are not sent to the smsc unless I restart kannel. I've
>>>>>>>>> used Alex's store_tools and verified the messages are in the queue.
>>>>>>>>> All messages are sent using the same code and use AT-SMSCs.
>>>>>>>>>
>>>>>>>>> 2) When I restart kannel, QUEUED messages are sent through only one
>>>>>>>>> at-smsc, even if the messages have a destination smsc-id specified
>>>>>>>>> in the sendsms url call.
>>>>>>>>>
>>>>>>>>> This is a snippet of the config I have:
>>>>>>>>>
>>>>>>>>> smsc-id = id1
>>>>>>>>> allowed-smsc = id1,id_op1
>>>>>>>>>
>>>>>>>>> smsc-id = id2
>>>>>>>>> allowed-smsc = id2,id_op1
>>>>>>>>>
>>>>>>>>> smsc-id = id3
>>>>>>>>> allowed-smsc = id3,id_op1
>>>>>>>>>
>>>>>>>>> smsc-id = id4
>>>>>>>>> allowed-smsc = id4,id_op2
>>>>>>>>>
>>>>>>>>> smsc-id = id5
>>>>>>>>> allowed-smsc = id5,id_op2
>>>>>>>>>
>>>>>>>>> etc...
>>>>>>>>>
>>>>>>>>> Note there is no smsc defined with smsc-id=id_op1 or id_op2 in the
>>>>>>>>> config file.
>>>>>>>>>
>>>>>>>>> When sending the messages I use &smsc-id=id_op1 or id_op2 in the
>>>>>>>>> url so kannel load-balances across the smscs of the corresponding
>>>>>>>>> operator and sends the message. This works fine until I restart
>>>>>>>>> kannel.
>>>>>>>>>
>>>>>>>>> After a kannel restart, all QUEUED messages are sent through ONLY
>>>>>>>>> ONE of the smscs, without load-balancing between smscs, even though
>>>>>>>>> there are hundreds of queued messages; however if, at the same time,
>>>>>>>>> I send new messages to id_op1 or id_op2, these new messages are
>>>>>>>>> correctly load-balanced between the at-smscs.
>>>>>>>>>
>>>>>>>>> I use:
>>>>>>>>>
>>>>>>>>> Kannel bearerbox version `1.4.3'. Build `Feb 13 2009 17:32:59',
>>>>>>>>> compiler `4.1.2 20070626 (Red Hat 4.1.2-13)'. System Linux, release
>>>>>>>>> 2.6.20-1.2962.fc6, version #1 SMP Tue Jun 19 19:27:14 EDT 2007,
>>>>>>>>> machine i686. IP 10.10.5.2. Libxml version 2.6.29. Using OpenSSL
>>>>>>>>> 0.9.8b 04 May 2006. Compiled with MySQL 5.0.27, using MySQL 5.0.27.
>>>>>>>>> Using native malloc.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Any ideas?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> Alvaro
>>>
>>>
>>> --
>>> Alan McNatty, Wellington, New Zealand
>>> Catalyst IT Limited <http://www.catalyst.net.nz/>
>>> DDI: +64 4 8032201
>>
>
>
> --
> Alan McNatty, Wellington, New Zealand
> Catalyst IT Limited <http://www.catalyst.net.nz/>
> DDI: +64 4 8032201
>


