
Re: golang-github-katalix-go-l2tp



On  Wed, Feb 14, 2024 at 18:44:54 +0100, Simon Josefsson wrote:
> Tom Parkin <tparkin@katalix.com> writes:
> 
> > On  Tue, Jan 23, 2024 at 18:05:23 +0100, Simon Josefsson wrote:
> >> Tom Parkin <tparkin@katalix.com> writes:
> >> 
> >> > Hi Simon,
> >> >
> >> > On  Mon, Jan 22, 2024 at 20:15:11 +0100, Simon Josefsson wrote:
> >> >> golang-github-katalix-go-l2tp
> >> >> https://salsa.debian.org/jas/golang-google-grpc/-/jobs/5191076
> >> >> === RUN   TestBasicSendReceive/5:_send/recv_[::1]:9000_[::1]:9001_L2TPv3_IP
> >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> >> >> level=info function=transport message=retransmit message_type=avpMsgTypeHello
> >> >> level=error function=transport message="socket read failed" error="resource temporarily unavailable"
> >> >> level=error function=transport message="transport down" error="transmit of avpMsgTypeHello failed after 3 retry attempts"
> >> >>     transport_test.go:388: test sender function reported an error:
> >> >> failed to send Hello message: transmit of avpMsgTypeHello failed
> >> >> after 3 retry attempts
> >> >> panic: test timed out after 10m0s
> >> >
> >> > This test is failing to send a packet over an IPv6 L2TPIP socket: it
> >> > will depend on the go runtime support for L2TPIP (which has been in
> >> > for ages), and also the kernel having the l2tp_ip6 driver loaded.
> >> >
> >> > I'd sort of expect to see messages along those lines when trying to
> >> > open the socket, though, rather than tx/rx failing :-/
> >> >
> >> > I'm not at all familiar with the environment of the Salsa test
> >> > pipeline -- could you expand on what the configuration is here?
> >> 
> >> Thanks for looking at the logs Tom.  I don't really know much about the
> >> environment except for these pointers:
> >> 
> >> https://wiki.debian.org/Salsa/Doc#Runners
> >> https://salsa.debian.org/salsa-ci-team/pipeline/
> >> 
> >> Does it set up a server on ::1 properly?  Any outbound connections?  Only
> >> http(s) is allowed.
> >
> > So I *think* the runtime env is a VM using the "Google Container-Optimized
> > OS".  The fact that the socket opens successfully but the packet is
> > apparently lost is suggestive of some kind of firewalling.  I'll see
> > if I can figure anything out from the Google docs.
> >
> > The tests work OK when run manually and when run as part of the
> > package build here, so I think it must be something specific to the
> > pipeline VM but I'm not sure what at the moment.
> >
> > In terms of the test configuration, it basically opens a socket for
> > each end of the connection and verifies it can send/receive over those
> > sockets.  It's the same test code for each configuration.
> 
> This problem happens for others:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1063746
> 
> Interestingly the failure seems arch-specific:
> 
> https://ci.debian.net/packages/g/golang-github-katalix-go-l2tp/

Interesting -- thank you for the further information.  The fact it
seems arch-specific is striking as you say, but it's odd that amd64 is
failing since that's what the code has been developed on.

If I can reproduce it in a sid chroot that'll be a good starting point
I think.  I will try this and see if I can get any more information.

Unfortunately I've not had time to dig further into the Google VM
docs, so a way to reproduce it outside that environment would be most
welcome.

> It could still well be that something in the salsa and debci VMs, and on
> the #1063746 reporter's machine, is causing this -- but it seems this
> clearly happens often enough, and is causing build failures checking
> reverse dependencies of several packages going into experimental, so it
> would be nice to fix it.  Do you have any ideas?  Could some test be
> disabled or silenced somehow?  I'm ignoring build failures in
> golang-github-katalix-go-l2tp meanwhile.

Possibly the test could be skipped if we could figure out the root
cause.

I did find when working on Fedora packaging that F38 had a strange
issue whereby the l2tp_ip kernel module was blacklisted, which would
cause the first IP encap test to fail.  Strangely, the l2tp_ip6 module
was not blacklisted, so the second time around the IP encap test would
pass: loading l2tp_ip6 also autoloads l2tp_ip, since the former
depends on the latter.

There's a fix in go-l2tp upstream for this issue, but I'm not sure
whether something similar might apply here.  If I can reproduce the
issue I'll see whether a workaround can be applied.

Thanks again,
Tom
-- 
Tom Parkin
Katalix Systems Ltd
https://katalix.com
Catalysts for your Embedded Linux software development
