Fedoraproject doings, last week of june
Hello everyone, long time no blog. :)
For a while now, I (along with adamw and sometimes others) have been posting to mastodon a shortish summary of the things we worked on each day. I'm not sure how useful people find this, but I have seen some indications that some people are reading them. However, 500 characters is pretty limiting: most days I only have room to be pretty terse. So, I thought it might be nice to start blogging again, going over the past week with more verbosity and discussing things in more detail than a mastodon toot usually goes into.
This sunday (the 30th), RHEL7 finally reaches end of life. We have been trying to finish up all the migrations away from it over the last few months. It looks like we aren't going to make it 100%, but almost all the things left are internal only and we can clean them up in the coming week or two. Things always take a lot longer than you might think, and some of these have been quite a long road, but it's great to see them finally done. A quick list and some thoughts about each:
- mailman3 / lists.fedoraproject.org: This one has taken a really, really long time and the work of a number of folks. All the packaging work to get it up and going in Fedora and EPEL was a heroic effort. Then came deploying staging, working everything out for deployment of the new stack, testing everything, and tweaking things. Thank you to all who worked on it. It feels great to have an up to date mailman and an up to date OS it's running on.
- fedorapeople/fedoraplanet: Our fedorapeople server (an instance for contributors to share files and such) has also taken a long road. It had our old planet (fedoraplanet.org) running on it, so we had to move that first. The planet is now running in openshift on a modern stack and aggregating feeds from our account system instead of from files on fedorapeople. Once that was finally ready, moving fedorapeople to RHEL9 was pretty easy and is now all done.
- PDC (product definition center) has also taken a long path. We adopted it, it got tangled up in a lot of our processes, and then upstream dropped it. We are not 100% done with it yet, but it's very close. The last few items using it should hopefully be moved by monday, and then we can turn it off next week. It's got a gigantic database, it never updated cleanly, and it throws random 500's from time to time, so it will be great to be rid of.
- Some smaller internal only things remain for the coming weeks: 2 virthosts need reinstalling, but they are blocked on some on-site changes. fedimg will be replaced by our cloud-image-uploader service; that's very close to done. Our github2fedmsg service is being re-written. We need to look at retiring fedmsg entirely so we can shut down its busgateway. Finally, some sundries servers need reinstalling, but I plan to do that today, at least mostly.
So all in all not perfect, but pretty close to the deadline. :)
Another thing I looked at earlier this last week was our power9 builders. The buildvm-ppc64le build vm's are currently our slowest builders, and maintainers often have issues with them. I updated the 4 main power9 virthosts to the latest kernel and f40 updates, but I am not sure how much it really helped. The main problem seems to be that these machines use 7200rpm sata drives. Those are just really not very fast, and when you get a bunch of vm's hitting the same raid array, the seek times just kill you. I did play around with some different raid configurations, but it didn't seem to help much. I've requested more memory and some ssd's for these machines in next year's budget, so hopefully we will get that. I am also considering using iscsi from faster storage. In the meantime, sorry they are slow; I'm doing what I can to mitigate it.
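If you're curious what that contention looks like, here's roughly how I'd measure it with fio; a minimal sketch, assuming the array shows up as /dev/md0 (the device name is hypothetical, adjust to taste):

```
# 4k random reads from several jobs at once, roughly approximating a
# bunch of build vm's hitting the same array. --readonly makes sure
# fio never writes to the device.
fio --name=seektest --filename=/dev/md0 --readonly --direct=1 \
    --rw=randread --bs=4k --iodepth=16 --numjobs=4 \
    --runtime=60 --time_based --group_reporting
```

On 7200rpm spinning disks the completion latencies from a run like that are dominated by seeks, which is why ssd's (or iscsi to faster storage) should help far more than kernel updates did.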
In rawhide news, I have been hitting a weird issue with the kernel and my backups. Normally they take something like 5-7min and I barely notice they are happening at all, but lately they cause the laptop to ramp all fans to 100%, become completely unresponsive, and take something like an hour. It might be that this is fixed in the most recent kernels; I will need to do some testing this weekend. It's pretty annoying to test, because you start the backup and... have to use the hard power off button if you want to get back to working.
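For anyone wanting to do the same sort of kernel testing, the dance is roughly this; a sketch, and the kernel version shown is hypothetical:

```
# grab the latest rawhide kernel
sudo dnf upgrade kernel
# list the installed kernels grub knows about
sudo grubby --info=ALL | grep ^kernel
# make a specific kernel the default to boot (version is made up here)
sudo grubby --set-default /boot/vmlinuz-6.10.0-0.rc5.fc41.x86_64
```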
Next week is going to be finishing up the rhel7 stuff and then moving on to reinstalling/upgrading builders to f40. (We couldn't do that before now because f40 has a new createrepo_c that defaults to zstd compressed metadata, which epel7 couldn't handle, so we couldn't upgrade until epel7 was gone.) Also next week or the week after we will probably need to do another mass update/reboot cycle before the f40 mass rebuild later in july. Then, on to flock in early aug.
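As an aside, if you hit the same zstd issue with your own repos, createrepo_c can be told to keep the older compression. A sketch, with a hypothetical repo path (check createrepo_c --help on your version for the exact flag names):

```
# keep the repo metadata gz-compressed so older clients like epel7's
# yum can still read it, instead of the new zstd default
createrepo_c --general-compress-type=gz /srv/repos/myrepo
```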