
Rawhide: The mid-May edition

A few rawhide happenings for you this time:

  • There's an ongoing issue with single-CPU systems not booting. This is a systemd bug (see https://bugzilla.redhat.com/show_bug.cgi?id=1095891). If you are running rawhide in a virt system, be sure to give your instance more than 1 CPU.
  • Our storage issues are all fixed up, and rawhide composes are back to taking around 4-5 hours.
  • Almost all koji builders have been re-installed/updated.
  • Additionally, we have added some buildhw instances, so we have quite a lot of builders now.
  • All the side tags in the buildsystem need to be merged in by 2014-05-26 for f21. These are for python, tcl, and possibly boost.
  • The f21 mass rebuild will be started on 2014-06-06, so soon after that there will be a LOT of updated packages in rawhide.
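As a workaround for the single-CPU boot issue above, libvirt users can bump a guest to two vCPUs before its next boot. A minimal sketch, assuming a libvirt guest; the domain name `rawhide-guest` is a placeholder, not anything from the original post:

```shell
# Bump a libvirt guest to 2 vCPUs to avoid the single-CPU boot bug.
# "rawhide-guest" is a placeholder domain name; substitute your own guest.
DOMAIN=rawhide-guest
if command -v virsh >/dev/null 2>&1 && virsh dominfo "$DOMAIN" >/dev/null 2>&1; then
    virsh setvcpus "$DOMAIN" 2 --config --maximum   # raise the allowed maximum first
    virsh setvcpus "$DOMAIN" 2 --config             # then the actual count
    echo "$DOMAIN will have 2 vCPUs on next boot"
else
    echo "no libvirt domain named $DOMAIN here; substitute your own guest"
fi
```

The `--maximum` pass is needed because virsh won't let the current vCPU count exceed the domain's configured maximum.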

This week in rawhide, the late April edition

Greetings, intrepid rawhiders. It's been a few weeks since nothing too exciting was going on, but now I have piled up all sorts of fun rawhide tidbits for you. :)

A recent library soname version jump in libgcrypt exposed the fact that the remmina package doesn't build anymore. It's not due to the libgcrypt change, but rather to the freerdp library it uses not being compatible. Rebuilding for soname bumps sometimes exposes these other issues you wouldn't normally be aware of, and shows you packages that don't build anymore. Sadly the remmina maintainer in Fedora doesn't have time for it anymore, so it's been orphaned. Before you go rushing off to take ownership of it, realize it's going to take some work to get going again. There have been no upstream commits on it since last October, and it's a pretty gigantic and complex application. I do hope someone is up for the challenge, as it's a pretty nifty app. For now, I've removed it from the Xfce applications group so we can get Xfce live composes in rawhide.

A number of changes have been proposed and accepted for Fedora 21 (coming to mirrors near you later this fall). This means that a number of them are likely to start landing in rawhide soon. Maintainers: do remember that you should only push things if they are in a workable/testable state; try not to cause undue problems for rawhide users.

There's been some more discussion around signing rawhide packages. Basically the best course of action would seem to be making some small script that watches fedmsg for official builds and then calls sigul to sign them. However, that is going to take some coding (sigul currently has no way to avoid requiring the user's passphrase to be entered interactively). In the meantime, as a stopgap, we could sign manually, but that means either we gate rpms (only signed ones go out in rawhide composes), or we sign what we can and other packages go out in composes unsigned. The first would mean more delay; the second would mean you couldn't depend on all packages being signed. More discussion ongoing. If you have ideas or want to step up to help write code around this, please do stop by the releng weekly meeting or the releng list.

Finally, later today we are moving our backend storage. I hope that tomorrow's rawhide will be a good deal faster to compose. We will see.

Java remote console on DRAC7 and Fedora...

Just a short blog post about accessing the Java remote console on DRAC7-equipped servers from a Fedora client (mostly so I can find this again in a few years and remember what I had to do). If you just try to launch it, everything goes great until the end, when you get "Connection Refused". This is of course a completely wrong error message. Really the problem is that it can't find cacerts. You must:

    mkdir -p ~/.java/deployment/security
    ln -s /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.60-2.5.0.16.pre02.fc21.x86_64/jre/lib/security/cacerts ~/.java/deployment/security/trusted.certs

Of course you need to fill in the openjdk version that you have installed. Once you do that, launch the console and it should connect and work fine. Many thanks to Patrick Uiterwijk for figuring this out. :)
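The two commands above can be wrapped so the openjdk version never has to be typed by hand. A small sketch, assuming the Fedora /usr/lib/jvm layout; the glob simply picks up whichever openjdk happens to be installed:

```shell
# Link the installed openjdk's cacerts into the java deployment dir,
# without hardcoding the openjdk version in the path.
mkdir -p ~/.java/deployment/security
CACERTS=$(ls /usr/lib/jvm/java-*/jre/lib/security/cacerts 2>/dev/null | head -n1)
if [ -n "$CACERTS" ]; then
    ln -sf "$CACERTS" ~/.java/deployment/security/trusted.certs
    echo "linked $CACERTS"
else
    echo "no openjdk cacerts found under /usr/lib/jvm"
fi
```

This also survives openjdk updates better, since the symlink can just be refreshed by re-running the script.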

This week in rawhide (the April edition)

A few juicy tidbits this week for rawhide.

First, there was a bug in the grubby package that landed on Friday in rawhide. If you installed a kernel after that, it likely did not boot correctly. If you installed a new kernel (3.14.0 came out Monday) with grubby-8.31-1.fc21 installed, you will need to edit your /etc/grub2.cfg or /etc/grub2-efi.cfg (or remove the kernel(s) with the broken entries), upgrade grubby, and reinstall them.

Next, there was some discussion about getting rawhide signed in the latest releng meeting. There's still no concrete proposal, but we agreed on things we don't want: no passphrases stored on disk, no signing code on the koji hub itself. We are considering some listener that could take a passphrase on startup (only in memory) and listen on fedmsg for builds to sign. That way the hub is completely unaware of it, and we could easily expand it to signing scratch builds with a scratch key, etc. There was also a lot of discussion around whether we want a dedicated rawhide key, or whether we use normal keys and resign at the branch point. Lots more discussion to happen on this before we get to a concrete proposal, but it's moving along.

I filed a few tickets on mash (mash is the thing that composes the trees of packages). One has already been fixed and the patch upstreamed: right now rawhide keeps at least one drpm for every binary package for every arch. This amounts to about 120,000 files in drpm directories. In addition to causing some mirrors problems due to filesystem limitations on files in one directory, the vast majority of them are useless. Most rawhide users update pretty often, and those that don't really shouldn't expect drpms for everything. So, we will now have the ability to keep only the last 7 days of them. This should vastly cut down on those files and also really help compose time. I am looking forward to this landing.

The next is around multiple packages. Mash currently has the ability to either gather up the 'latest' package in a tag, or all packages in a tag. We use the 'latest' config, since all packages in a tag would be way too large. I'd like there to be a middle ground, possibly keeping 2 versions of a package (the 'latest' and the one before it), and possibly only for a limited time (say, again, a week). This way users could downgrade easily if they noticed a problem in the window, but we wouldn't need to double the size of rawhide. This is bug https://bugzilla.redhat.com/show_bug.cgi?id=1082830

The final one is around mash's blacklists and whitelists. Right now mash has a number of different multilib methods (basically 32-bit support for 64-bit trees). In those, some packages are whitelisted (always get added) and some blacklisted (always removed), but this is hard-coded into a file in mash, meaning that you have to do a new release or patch locally to get changes in. These should be config options so you can adjust them dynamically on the fly. This is bug https://bugzilla.redhat.com/show_bug.cgi?id=1082832 If anyone wants to work on those, patches welcome. :)
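The 7-day drpm retention fix described above amounts to a dated cleanup pass over the drpm directories. A minimal sketch of the idea, demonstrated on a scratch directory (the real mash change operates on the compose tree, not these paths):

```shell
# Demonstrate "keep only the last 7 days of drpms" with a scratch directory.
DRPM_DIR=$(mktemp -d)
touch "$DRPM_DIR/fresh.drpm"                   # a new delta rpm
touch -d '10 days ago' "$DRPM_DIR/stale.drpm"  # an old delta rpm
find "$DRPM_DIR" -name '*.drpm' -mtime +7 -delete
ls "$DRPM_DIR"    # only fresh.drpm remains
```

The `-mtime +7` test matches files last modified more than 7 days ago, so only the stale deltas get deleted.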

This week in rawhide, the mid-March edition...

Time for another exciting edition of this week in rawhide...

Last week saw a new systemd version landing in rawhide, and with it a few changes to the way systemd-logind worked. This was fine (progress, ever marching forward), but SELinux policy needed to be adjusted for this new setup as well. This caused folks with SELinux enforcing (which should be nearly everyone these days) to not be able to log in, or to be able to log in but have logind not 'see' them as logged in, leading to odd issues with sound permissions, needing the root password to reboot, etc.

This is an important lesson in troubleshooting. When you run into issues like the above, one very good thing to check is whether setting your install to permissive SELinux mode makes things work again. You can then file a good selinux-policy bug, get a fix, and re-enable enforcing.

Otherwise things have been rolling along pretty smoothly. The Intel wireless in my laptop has been kind of flaky for a bit, but the new linux-firmware that came out the other day seems to have helped it some. (It's a 7260AC.) Keep on rolling along...
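The permissive-mode check described above is a short sequence of standard SELinux commands. A sketch of the procedure; setenforce and ausearch need root, so those steps are shown commented:

```shell
# Check whether enforcing SELinux is behind a mysterious failure.
if command -v getenforce >/dev/null 2>&1; then
    getenforce                      # usually prints "Enforcing"
    # setenforce 0                  # go permissive temporarily, then retry the failing action
    # ausearch -m avc -ts recent    # collect AVC denials for the selinux-policy bug report
    # setenforce 1                  # back to enforcing once a fix is in place
else
    echo "SELinux userland tools not installed on this system"
fi
```

Attaching the ausearch AVC output to the bug is what lets the policy maintainers write a targeted fix quickly.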

Looking up at the cloud (the Fedora infrastructure private cloud)

We have been running an OpenStack Folsom cloud in Fedora Infrastructure for a while now, and I thought I would take a snapshot of what's running on it right now to see what kinds of things we are using it for. The cloud is currently 7 machines: 1 head node and 6 compute nodes (another one is down with a bad disk right now). It was somewhat manually installed about a year and a half ago. This is just a snapshot in time I took right now, so of course instances appear and disappear over time. Also, I was pretty rough in deciding what 'bucket' instances should be listed in, but it gives a bit of an idea of usage: 83 total instances.

  • copr is the single biggest user, which is not a surprise: 25 builder instances and 2 infrastructure ones (a frontend instance and a backend instance). This number is going to vary a lot depending on how many builds are in its queue.
  • 13 instances in our 'transient' tenant. These are instances someone has spun up to test something or try doing something.
  • 13 dev servers of various kinds. We have moved almost all our development instances into the cloud. So, it's one per application and is a place for developers to deploy and test things before making releases and pushing changes into our staging infrastructure.
  • 10 'one-off' instances. These are things I couldn't easily see the purpose of; just instances for some one-off projects.
  • 8 instances are used by the python twisted project. We set them up to use instances for CI and testing.
  • 5 instances are 'test' instances. These are usually applications we don't make, but might want to look at deploying. We can spin up a test instance and see if they would meet our needs.
  • 1 artboard instance for our design folks.
  • 1 Fedora Continuous instance for the ostree folks.
So, you can see, largely we are using our cloud for copr, test/development/one-offs, and a way to host some instances for other projects. We are hopefully getting some additional hardware soon and will be setting that up with a newer OpenStack (hopefully deployed in a less manual way), and then we can migrate the users of this cloud to that one and re-install it. I think we are getting some good use from the cloud, and I only see it growing (if carefully) over time.

Just say no to re-releasing the same version of software

I have seen this a few times this week, so I thought I would make a plea to free software producing projects: please don't re-release some version of your software with changes without changing the version number too.

If you do this, it means your users can't easily be sure what they have. Do they have the initial release? Or do they have a later release with some additional, and likely undocumented, changes? If they report a bug to you, it could well be that those "last minute" changes caused it. Or fixed it. You can't tell without resorting to checksums.

You also cause problems for distributions that package your software. If they build and distribute the first release, then push out the second release with the same version, they often will need to mess with their build systems and such to 'rebuild' the same version of something with a different source.

Often when projects do this they also make some number of small changes, but since there has already been an announcement and updates to ChangeLogs and NEWS files, these 'last minute' changes are things only people who comb through source code control will ever know about. Additionally, such 'last minute' changes often bypass testing and checks, which can actually lead to breaking things rather than fixing them. If you often do this sort of thing, it can also lead to decreased security, as your users stop bothering to check checksums, or just assume you made another 'behind the back' update of the release when it was actually an attacker doing so.

If you find some horrible last minute bug or issue that makes you think you have to fix it asap, please just do another release. It's the transparent, sane thing to do. Once a release is made, it's done and history, and shouldn't be changed.

This week in rawhide, the late February edition

Another few weeks, another rawhide weekly(ish) blog. There were a few issues which I didn't hit at all, but others did; they are reported over on the Rawhide Watch blog. Namely, an SELinux issue with gdm and an issue with /var/run not being a link to /run on older installed systems. Go take a look at those and subscribe if you are a regular rawhide user. ;)

systemd-210 landed in rawhide. Sadly, it seems to segfault here from time to time (see bug 1069752). Amazingly, things pretty much keep running along fine after that happens. Aside from not being able to run systemctl, not having socket units answer, and having to hit the power button to shut down, the machine keeps running along. Hopefully this will get fixed up soon.

Rawhide week(s) for mid-February

I've not done a rawhide post in a few weeks because there wasn't much interesting going on. I finally had a few things this last week, so here's another installment. ;)

Last week there was a libicu bump in rawhide. It's always a bit of fun when a library that many things depend on changes its ABI (it's 58 or so packages). In this case the maintainer announced things well in advance, but I at least missed that they were not themselves a provenpackager, nor had one (or more) lined up to rebuild things. This caused the usual churn while maintainers and provenpackagers got things rebuilt. It also showed another issue that happens from time to time in rawhide: something changes and you need to rebuild your package, so you do, but you discover that an earlier change in tools causes it to not rebuild. This happened to the open-vm-tools package. It needed rebuilding for the icu bump, but when the rebuild was done, it was discovered that a glib2 change a while back had actually silently made it not build.

Rawhide these days produces a boot.iso every day (unless the compose fails). For a week or so it was producing the image fine, but it turns out that it was not functional. Several folks tried it and the network wasn't working: dhclient on the image would fail to run due to a missing library. I looked at dependencies and changes, and it took a while to find out that lorax (the tool that makes the boot.iso) cleans out some libraries that aren't needed (to save space, I assume). This bind library was one of those, and removing it was causing the dhclient failure. By the time I had tracked it down, there was already a fix in lorax upstream and a new lorax build the next day to fix it. This teaches us several things: just because the boot.iso is created does not mean it works, and please do report bugs when you see them. A number of people saw this issue, but no one tracked it down or filed a bug until a week or so later. ;(

Finally, observant folks might notice that there was no rawhide compose today. The compose process failed with a traceback in yum. There's a likely change that might have caused the issue, so that change was reverted and I restarted the rawhide compose. There may yet be one today. :) Stay tuned for more exciting adventures of rawhide...

This week in rawhide, the late January edition

Another few weeks of rawhide rolling along.

Last week (Friday) there was an unfortunate soname bump in the bind package. This caused the dhcp package to be unhappy until it was rebuilt. If you were wondering why there were no rawhide images for a few days, that was why. I went ahead and rebuilt it Sunday.

The astute out there may have noticed that rawhide composes have been running quite slow the last few weeks. Oddly, I haven't actually had anyone ask me about it or complain, which is kind of sad actually. ;) The problem is the netapp storage that we use to compose on. Storage folks are working on the problem, and hopefully it will get fixed up soon. In the meantime composes have been taking 2-4x as long as normal. ;(

I've updated all our buildvm* and arm* builders to Fedora 20. In the process I reworked our ansible playbooks for them a great deal. They were some of the very first playbooks we ever made in ansible, and were showing their age. Things are much faster and cleaner now. Along the way I ran into a kernel problem on arm (fixed in a newer kernel version) and a createrepo problem on the x86 builders (downgraded createrepo for now; there's an outstanding bug to fix the issue). This problem resulted in epel buildroots not regenerating correctly for a day or two. :(

Otherwise, another few pretty normal weeks of rawhide. ;)