handle_tcpconn_ev(): connect failed

Donat Zenichev
Hi.

Recently I ran into a problem with TCP connections.
The topology is as follows:

DNS SRV load balancing - two Kamailio proxy servers - one routing server.

Clients resolve a NAPTR record such as sip.domain.com.
DNS then gives the client one of the proxy servers (depending on weight/priority). Right now both Kamailio servers have the same priority and weight (the goal is load balancing).
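
To make the weight/priority selection concrete, here is a minimal Python sketch of how a client could choose between the two proxies. It assumes the client applies a simplified RFC 2782-style rule (lowest priority value first, then a weighted random choice). The host names kamailio1.domain.com and kamailio2.domain.com, the port, and the helper pick_srv_target are hypothetical and only serve as illustration.

```python
import random
from collections import namedtuple

# One SRV record: lower priority value is preferred, weight splits the load.
SrvRecord = namedtuple("SrvRecord", "priority weight port target")

def pick_srv_target(records, rng=random):
    """Pick one record: lowest priority first, then weighted random choice."""
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]

# Hypothetical records: both proxies share priority and weight.
records = [
    SrvRecord(10, 50, 5060, "kamailio1.domain.com"),
    SrvRecord(10, 50, 5060, "kamailio2.domain.com"),
]
print(pick_srv_target(records).target)
```

With equal priority and weight, each lookup may land on either proxy, so nothing guarantees that an in-dialog request resolves to the proxy that handled the initial request.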

The routing server (currently Asterisk) runs chan_pjsip.so, which supports NAPTR/SRV records.
It can resolve Record-Route / Route headers whose value is sip.domain.com (the proxy servers put this name into the Record-Route header when relaying requests to it).
The topology is designed to keep existing dialogs alive even if the proxy that originally handled them is dead.

The problem appears when the routing server (Asterisk) sends in-dialog requests to a proxy that was not used to establish the dialog.
For example, the routing server receives a 200 OK from the endpoint (relayed to it by kamailio1) and sends back an ACK, but not to kamailio1: it sends it to kamailio2 (because it resolves the NAPTR record sip.domain.com and gets the IP of the second Kamailio). Kamailio2 processes the request as usual, because both Kamailio servers share the same database for the dialog module, but when it tries to relay the request to the endpoint, it gets this error:
ERROR: <core> [tcp_main.c:4070]: handle_tcpconn_ev(): connect XXX.XXX.XXX.XXX:52185 failed

The port that kamailio2 tries to connect to when relaying the ACK is the source port the endpoint used to establish the dialog with kamailio1, and the endpoint's TCP connection is still established with kamailio1.
So kamailio2 tries to open a new connection to that same port and gets the error.
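
The failure can be reproduced on localhost with plain sockets. The following is only a sketch under stated assumptions: proxy1, proxy2 and endpoint are stand-ins for kamailio1, kamailio2 and the SIP client, there is no NAT or firewall in between, and the function name connect_to_ephemeral_port is made up for this example.

```python
import socket

def connect_to_ephemeral_port():
    """Return (refused, endpoint_port) for a connect to a client's source port."""
    # Stand-in for kamailio1: a listening TCP socket, port chosen by the OS.
    proxy1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    proxy1.bind(("127.0.0.1", 0))
    proxy1.listen(1)

    # Stand-in for the endpoint: connects to proxy1 from an ephemeral port.
    endpoint = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    endpoint.connect(proxy1.getsockname())
    conn, peer = proxy1.accept()
    endpoint_port = peer[1]

    # Stand-in for kamailio2: it has no connection to the endpoint, so it
    # must open a new one to the endpoint's source port, where nothing listens.
    proxy2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    proxy2.bind(("127.0.0.1", 0))  # own local port, distinct from the endpoint's
    proxy2.settimeout(5)
    try:
        proxy2.connect(("127.0.0.1", endpoint_port))
        refused = False
    except ConnectionRefusedError:
        refused = True
    finally:
        for s in (proxy2, conn, endpoint, proxy1):
            s.close()
    return refused, endpoint_port

refused, port = connect_to_ephemeral_port()
print("connect to endpoint port", port, "refused:", refused)
```

The endpoint's port belongs to one established connection (endpoint to proxy1) and is not a listening socket, so a new connection attempt from any other host is rejected.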

And I think this is the proper behavior.

There is no problem with UDP transport.
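
For comparison, a minimal sketch of the UDP case, again with localhost sockets standing in for the endpoint and the two proxies. It assumes there is no NAT between the proxies and the endpoint; the function name udp_from_second_proxy is hypothetical.

```python
import socket

def udp_socket():
    """Create a UDP socket on localhost with an OS-chosen port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2)
    return s

def udp_from_second_proxy():
    """Return (payload, came_from_proxy2) as seen by the endpoint."""
    endpoint, proxy1, proxy2 = udp_socket(), udp_socket(), udp_socket()
    try:
        # The endpoint talks to proxy1 first (dialog setup).
        endpoint.sendto(b"INVITE", proxy1.getsockname())
        proxy1.recvfrom(2048)
        # Later proxy2 sends to the very same endpoint port.
        proxy2.sendto(b"ACK", endpoint.getsockname())
        data, sender = endpoint.recvfrom(2048)
        return data, sender == proxy2.getsockname()
    finally:
        for s in (endpoint, proxy1, proxy2):
            s.close()

print(udp_from_second_proxy())
```

A UDP socket is not tied to a single peer, so the endpoint accepts the datagram from the second proxy on the port it used with the first one.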

Has anyone seen a similar problem? Strictly speaking, it is not a bug but expected behavior.



-- 
BR, Donat Zenichev
Wnet VoIP team
Tel:  +380(44) 5-900-808
http://wnet.ua

_______________________________________________
Kamailio (SER) - Users Mailing List
[hidden email]
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users

Re: handle_tcpconn_ev(): connect failed

Daniel Tryba-2
On Thu, Sep 07, 2017 at 11:03:49AM +0300, Donat Zenichev wrote:
[snip]

> ERROR: <core> [tcp_main.c:4070]: handle_tcpconn_ev(): connect
> XXX.XXX.XXX.XXX:52185 failed
>
> The port that kamailio2 tries to use to relay the ACK, is port that
> endpoint used to establish the dialog with kamailio1 and actually his TCP
> connection is now established with kamailio1.
> So kamailio2 tries to use the same port and gets the error.
>
> And this is proper behavior I think.
>
> There is no problem with UDP transport.

This problem also exists with UDP when NAT is involved. I don't think
there is anything you could do to solve this problem with TCP/TLS
connections, especially with NAT.

Having a similar setup with failover for the loadbalancers, I take for
granted that TCP/TLS will fail in case of a failover (but UDP will keep
working after failover due to the stateless nature of it). Luckily
kamailio is rock solid and the only reason the TCP sockets fail is a
restart of kamailio on config change.


Re: handle_tcpconn_ev(): connect failed

Donat Zenichev
> Having a similar setup with failover for the loadbalancers, I take for
> granted that TCP/TLS will fail in case of a failover (but UDP will keep
> working after failover due to the stateless nature of it).
Well, do your routing servers use PJSIP for NAPTR resolution?
If so, how did you make it work?
I mean, did you find a solution for TCP connections?




Re: handle_tcpconn_ev(): connect failed

Daniel Tryba-2
On Thu, Sep 07, 2017 at 01:05:02PM +0300, Donat Zenichev wrote:
> > Having a similar setup with failover for the loadbalancers, I take for
> > granted that TCP/TLS will fail in case of a failover (but UDP will keep
> > working after failover due to the stateless nature of it).
>
> Well, your routing servers use PJSIP for NAPTR resolving?
> If so, how have you made it working?

I use an all-Kamailio solution, routing purely on dialog headers (new
dialogs to UACs are routed on the Path header added during REGISTER).
 
> I mean, did you find a solution for TCP connections?

But the issue with TCP is the same. On failover/restart (or a proxy being
completely down in your case), the TCP session is lost and is not
recoverable, assuming NAT. AFAIK there is no solution.


Re: handle_tcpconn_ev(): connect failed

Donat Zenichev
> But the issue with TCP is the same. On failover/restart (or completly
> down in your case), the TCP session is lost, and non recoverable
> assuming NAT.  AFAIK there is no solution.
It's a pity. My common sense tells me that there is no way to solve it cleanly, without stupid "crutches".
But I still hope there is a solution, because I don't want to abandon my idea.
Guys, the question is still open: if anyone can suggest a solution, I will be glad to read it.


