
Re: Streamlining the use of Salsa CI on team packages



On 19-09-15 18 h 01, Thomas Goirand wrote:
> On 9/15/19 4:10 AM, Louis-Philippe Véronneau wrote:
>> On 19-09-14 17 h 35, Thomas Goirand wrote:
>>> On 9/13/19 11:08 PM, Louis-Philippe Véronneau wrote:
>>>> On 19-09-13 05 h 57, Thomas Goirand wrote:
>>>>> On 9/5/19 7:40 AM, Louis-Philippe Véronneau wrote:
>>>>>> Hello folks!
>>>>>>
>>>>>> I'd like to propose we start using Salsa CI for all the team packages. I
>>>>>> think using a good CI for all our packages will help us find packaging
>>>>>> bugs and fix errors before uploads :)
>>>>>
>>>>> I would agree *IF* and only *IF* we find better runners than the one
>>>>> currently default in Salsa. The GCE runners are horribly slow (they are
>>>>> the smallest to avoid cost). As a result, some tests may just fail
>>>>> because of that, and it becomes just frustrating / annoying noise.
>>>>
>>>> I never experienced such timeouts, but I guess I don't work on very
>>>> large packages or things that take more than a few minutes to build.
>>>
>>> The issue isn't build time. But when you have unit tests sensitive to
>>> timing. See for example openvswitch:
>>>
>>> https://salsa.debian.org/openstack-team/third-party/openvswitch/pipelines/61713
>>
>> Do you have similar issues running those CI tasks in a private runner?
>> (I'm really curious since I haven't had problems and the Salsa runners
>> don't seem slow compared to the private runners I run on my machines).
> 
> For this particular package, I even had issues with some buildds on some
> slow architectures like older MIPS. But with the Salsa default
> runners, it's a complete disaster where most of the tests fail, not
> just a few, because the runner is too slow.
> 
> What this shows is that we should *not* just blindly add the CI to all
> of the team's packages. Potentially, this will be a disaster. You may add
> the CI script here and there, but I am warning you: adding it to all
> packages at once is a recipe for a big disaster.
> 
>> Maybe one solution to your problem would be to provide a fast/responsive
>> shared runner to the Salsa Team and tag your CI pipelines to use that
>> runner exclusively [1]?
>>
>> [1] https://docs.gitlab.com/ee/ci/yaml/#tags
> 
> Yes, that's what I've been saying from the beginning. We should try
> providing other runners for the team if possible.
> 
>> [1]
>> https://salsa.debian.org/salsa/salsa-terraform/blob/master/environments/prod/runner.tf
> 
> This tells "instance_type: g1-small", which doesn't match any name at:
> https://cloud.google.com/compute/vm-instance-pricing
> 
> Am I right that this is n1-standard-1, which is 1 VCPU and 3.75 GB?
> 
>> It's possible to push to Salsa without triggering a CI run with "git
>> push -o ci.skip" or by including "[ci-skip]" in the HEAD commit message.
>>
>> IIUC, the problem isn't the overall number of repositories using the CI,
>> but adding 1000 of them at the same time and overloading the runners.
> 
> Ah, nice, good to know.
> 
>>> 1/ Take a super big care when adding jobs.
>>
>> I feel this is easily resolved by the "-o ci.skip" thing.
> 
> Good!
> 
>> I'm not 100% sure that's a good idea. The Salsa Team has pretty strict
>> requirements about shared runners (they require root on those machines
>> to make sure the .debs created by the runners can be trusted) and I'm
>> happy they do.
> 
> I didn't know, and this makes me question the overall way it works, and
> worries me a lot. ie we should be running on throwaway VMs, rather than
> having a VM we should be able to trust. The way you describe things, I
> wonder how easy it should be to get root on these VMs by running a
> crafted CI job...

Well, runners aren't running the CI jobs directly: everything is run in
Docker, as that's the only available executor on the Salsa shared
runners. Even then, it uses a GitLab special sauce, so it's not even
straight-up Docker.

>> I really wonder how common the issues you've experienced with the Salsa
>> CI runners are. Has anyone here had similar problems?
> 
> Since we're talking about the smallest type of instance possible at
> Google, other people may well have experienced the lack of RAM.
> 
>> I'd be fine with 95% of our package using the same default pipeline and
>> the last 5% using something else or disabling it and adding a few
>> comments in d/gitlab-ci.yml explaining why.
> 
> The question is: how do you know which 5% needs more attention?

I don't really think a failing CI is a big deal. It's not like this will
break any package anyway.

The way I see it, we'd push to all the repositories and then let folks
working on individual packages disable it if it's causing them trouble,
as long as they document why they disabled it.

Since the CI won't be run on the first push, we won't get an avalanche
of mails saying the CI failed for reason X, Y or Z.
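
A minimal sketch of how that first push could be scripted, assuming the
"git push -o ci.skip" option quoted earlier in this thread. The helper
name build_push_command and the "origin"/"master" values are
hypothetical, chosen only for illustration. The script prints the
command rather than running it.

```python
from typing import List


def build_push_command(remote: str, branch: str, skip_ci: bool = True) -> List[str]:
    """Return the argv for a git push, optionally asking GitLab to skip CI."""
    command = ["git", "push"]
    if skip_ci:
        # "-o ci.skip" is the push option quoted earlier in this thread.
        command += ["-o", "ci.skip"]
    command += [remote, branch]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is pushed by accident.
    # "origin" and "master" are placeholder values, not taken from any real repo.
    print(" ".join(build_push_command("origin", "master")))
```

Later pushes would call the helper with skip_ci=False, so the pipeline
runs as usual once the CI file is in place.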

>> FWIW, I've opened an issue on the Salsa Support issue tracker to see
>> what the Salsa team thinks of this whole discussion [3]
>>
>> [3]: https://salsa.debian.org/salsa/support/issues/170
> 
> Thanks a lot for doing this, taking the time to communicate with the
> Salsa people, etc.
> 
> I'm all for more CI, so feel free to ignore my remarks and go ahead; my
> intention was just to bring your attention to things I've seen. If it works
> well, then fantastic! :)

The Salsa folks are asking us to wait a little before implementing such
a change, as it seems they haven't yet decided what the infrastructure
needs are to support a systematic use of GitLab CI on Salsa. I feel
that's a reasonable request.

I was kindly pointed to this discussion on -devel by the Salsa admins:
[1][2][3][4].

I think we should still try to agree on a draft modification to the
policies. I'll send a recap mail of the discussions we've had to try to
move forward with this.

[1]: https://lists.debian.org/debian-devel/2019/09/msg00217.html
[2]: https://lists.debian.org/debian-devel/2019/09/msg00220.html
[3]: https://lists.debian.org/debian-devel/2019/09/msg00270.html
[4]: https://lists.debian.org/debian-devel/2019/09/msg00241.html

-- 
  ⢀⣴⠾⠻⢶⣦⠀
  ⣾⠁⢠⠒⠀⣿⡁  Louis-Philippe Véronneau
  ⢿⡄⠘⠷⠚⠋   pollo@debian.org / veronneau.org
  ⠈⠳⣄
