
This week in Fedora Infrastructure

Another week and a second 'this week in Fedora infrastructure' post. :)

Early this week I switched one of our backup servers over from puppet to ansible and added some rdiff-backup setup on it. It still needs a lot of tweaking before the rdiff-backups are useful, but it's well under way. This should give us some more on-line type backups for things and still leave us with tape for long term needs.

I also set up a base instance for QA folks to explore their taskbot setup on. It was pretty simple to add things to ansible and have it create, install and configure the instance for me. Pretty nice to not have any of those steps needing to be manual anymore.

Since rawhide is now installable again, I added an option to install it from boot.fedoraproject.org. Try it out and let me know if there are any problems with it.

I also did a fair bit of prep work for my visit out to our main datacenter next week. As a reminder, we have an outage scheduled for Tuesday starting at 21:00 UTC. I'm going to be adding some memory to virtual hosts and getting everything up to date on updates. If you're looking for me next week, email may be your best bet.

This week in rawhide for 2013-07-24

Another slightly late edition of this week in rawhide. :) Overall things continue along in a pretty boring manner, with a few exceptions:

I ran into a situation the other day where systemd stopped talking to dbus. This is not fun, as you can't start/stop anything or even see status. There didn't seem to be any updates that looked related, and the issue happened overnight. I rebooted and everything was fine, so I'm holding off on filing a bug until I see it again or can isolate exactly what is to blame.

It looks like the issue with NetworkManager taking a few minutes to bring up the network on boot was finally fixed today. Hurray!

There are still parts falling into place that we need before the mass rebuild happens, but hopefully we will be starting that early next week. Rawhide users will see close to every package they have installed updated, so get ready for heavy bandwidth usage. The last mass rebuild took around a week or so to fully complete. We might be faster this time; I guess we will see.

Astute folks will notice there are no install images for rawhide today. They failed to compose correctly. It looks like a transitory error that will hopefully be fixed tomorrow.

This week in Fedora Infrastructure

I'm a happy consumer of the excellent CyanogenMod Android distro, and recently they have started doing a short weekly blog post with interesting things of note that happened in their community over the last week. I figured this might be a nice thing to try and do for Fedora Infrastructure too. Not sure if I will keep it up, but I'll give it a go, and if you find it interesting let me know. ;)

We had a meeting on IRC with a general overview and some things we need to sort out with our migration to ansible. You can read the meeting minutes at: http://meetbot.fedoraproject.org/meetbot/fedora-meeting-1/2013-07-17/infrastructure-ansible-meetup.2013-07-17-19.00.html Help or comments are always welcome. Many of the outstanding questions are things we will likely discuss and finish planning at the upcoming flocktofedora conference. There's also now a wiki page with this information on it: https://fedoraproject.org/wiki/Infrastructure_ansible_migration

We just finished up a long outage of our koji buildsystem. It went down at 21:00 UTC on Thursday and was back up around 20:00 UTC today. This outage was to move our backend storage for builds. We were bumping up against capacity on our existing storage, so we wanted to move over to a place with more room to grow. Additionally, we tweaked the koji db instance to be a lot faster (it's a virtual machine, like so many of our hosts, and it wasn't using virtio for its disks).

I also took the chance to reinstall all of our builder instances. Ansible made pretty short work of the reinstall/reconfig: about 40 instances were reinstalled and updated to the latest packages. We also did a clean reinstall and config on all our ARM builders (48 now added to the primary buildsys). Look for ARM package import and enabling in Fedora 20 soon. ;)

rawhide week(s) 2013-07-16 edition

Skipped a week there as there wasn't much going on in the rawhide world, just rolling on. There are a few oddities to mention this week however:

On recent rawhide kernels, booting up results in no wireless network for about 2 minutes or so, then magically the firmware loads and everything starts working. I'm not sure if this delay is in the kernel, udev, or NetworkManager, but it's kind of annoying. I will try and track it down enough to file a bug soon.

Also, and this one is entirely cosmetic, the NetworkManager applet works fine after the first time you connect to a VPN, but never finishes its little 'connecting' animation. So, there's a little swirling icon in my system tray forever after the first VPN connection. ;( https://bugzilla.redhat.com/show_bug.cgi?id=985523

There's likely to be a lot of activity soon in the rawhide world, as we near the Fedora 20 mass rebuild. Stay tuned.

Finally, rawhide is now once again building nightly boot.iso images. So, you should be able to download those and install rawhide directly from the network (PXE images are also created). There's probably some more work to be done on it, but we could/should also look at fedup support for moving to rawhide.

An Incomplete remembrance of a friend

As many folks may know, Seth Vidal was taken from us this last Monday night. He will be dearly missed by many. He was part of many families: Fedora, Red Hat, the Open Source community, biking enthusiasts and many, many more. I knew him for many years in the Fedora community, and then when I joined Red Hat and started working with him every day I knew him even better.

Why did I title this an "Incomplete" remembrance? Two reasons:

First of all, it's impossible to convey in something as trivial as a blog post what he meant to all those he touched.

He was always ready to be a sounding board for ideas, and more often than not he would get you to rethink some idea in a much more elegant way or get it to lead to a much bigger and more impressive idea.

He was always looking out for his friends, either to ask you if all was ok or to commiserate about some problem that was happening.

He was a master of open source tools. If you asked some random question on IRC, he would often say "just a second" and come back with a script that did exactly what you wanted. He didn't do this just for people he knew, but for anyone with problems to be solved. He was patient with anyone who wanted to learn.

He had a terrific sense of work/life balance that I often wished I had. He would always be there if servers crashed or some urgent issue came up, but was quick to push less urgent issues off to ride his bike or walk his dog or spend time with his partner.

He had a way of gathering people at FUDCons or the like. After sessions were over you could always find a circle of people around him laughing and talking.

He always somehow remembered little things about everyone he knew: what you liked, or something interesting in your area of interest that he had come across.

And so much more.

The second reason I call this an Incomplete remembrance is that while Seth is gone, he's not really GONE. He lives on in all the people he touched, all the communities he strengthened and all the code he wrote for the world to enjoy.

So, the next time you 'yum update' your Fedora or RHEL machine, or use coprs, or use the ansible yum module, or anything he touched, remember him. The next time someone tells you their crazy idea, ask yourself "What would Seth say?" and help them make their idea more grand and awesome than they could ever have possibly imagined. You will be missed, but never forgotten.

Fedora release days aren't as frantic as they used to be, and why it's mostly a good thing.

Folks who have been involved in the Fedora project for a long time may remember early releases where things were very frantic on release days: servers and networks melting under downloads, eager users showing up with many questions and bugs, buzz at the first look at something just seeing the light of day. In my opinion recent release days haven't been like that, and there are many reasons for the changes.

The easy and cynical answer people would jump to is that there are just fewer people interested in Fedora releases, with all the competing operating systems and other things with mindshare. There could be some truth to this, but in my opinion the items below also explain things, and in a more compelling way.

Now a bit of a digression for some history: In the Fedora Core days, the "core" images were created and tested inside Red Hat, so release day was exciting in that only a few folks had ever seen the release before then. Then the Fedora releases started with the Core and Extras merge. In those early releases, test / alpha / beta / preview composes were made, but were really only distributed to very involved testing folks. Until Fedora 13, releases were made from rawhide content.

Fast forward to today and let's look at things that have changed:

On moving to the next release before it's out:

  • Since we branch the next release off, it's much easier to jump to the next release before it is released. 
  • There's much less confusion about being on rawhide or not, and also there's only one thing you need to do to cleanly switch to the next release (do a yum distro-sync or re-enable updates-testing once it's disabled right before release).
  • Because we branch off from rawhide, more disruptive changes can land there instead of either messing up the branched setup before release or slowing down maintainers.
On number of milestones/test images:
  • Back in the Fedora 9-13 days there were usually only a few test releases: Alpha, Beta and Preview. These days we make a LOT more of them. For the Fedora 19 cycle, for example, we did 29 composes (between Test Composes, Release Candidates, etc). This is more work for release engineering and QA folks, but I think it has really helped bring in more testing from the community and makes it easier to jump in.
On distributing test images:
  • General network bandwidth has exploded. Where before a test image being downloaded by a bunch of people would saturate the network, it now doesn't cause nearly as much trouble.
  • The Fedora mirror network has grown over time and there are a LOT of mirrors available in it now.
  • There are more master mirrors. We have 5 master mirrors in one datacenter, 3 in another and 1 more in yet another.
  • While test images aren't mirrored or distributed via torrent, the master mirrors can handle the downloads just fine, since the datacenters they are in have lots of bandwidth available.
On general stability:
  • I've been running rawhide full time on my laptop and the number of issues I have hit in the last 6 months has been pretty small. IMHO, our releases have been getting a lot more stable over time. I think there are a lot of reasons for that: increased QA, better reporting tools, better updates policy, an increased number of people who can fix issues, etc.
  • Since the branched release is an even more gated/stable rawhide, it's even more usable day to day, resulting, I think, in lots of people moving to it at the early milestones where in the past they would wait for the release. I used to move to the new release at Beta, then started moving to it at Alpha, and now I just stick to rawhide. ;)
On upgrade methods:
  • In Fedora 7 and 8 the only upgrade method was via anaconda in the install media. In Fedora 9-17 we added preupgrade to the mix, and finally in Fedora 18 we replaced both preupgrade and the anaconda upgrade in media with fedup. Obviously, if you need install media to upgrade, you have to wait for it to exist (i.e., release day or close to it). Preupgrade checked a file in Fedora infrastructure to see if a new release was available to upgrade to, and that was only flipped on release day, so few people would see or use it before release. With fedup you can run it anytime to go to the branched release (although right now it uses images from Fedora 19 Beta to upgrade). This results in lots more people using it before release. If you are going to upgrade, there's little point in waiting for the release.
  • Yum has gotten much smarter. Starting in Fedora 13 you could use yum's distro-sync command to sync your local machine to what's in the repositories (upgrading or downgrading). This made it a good deal easier to switch from/to branched releases.
On miscellaneous other changes:
  • In the early days we did the 'bit flip' (the change in permissions that makes the new release tree readable on the master mirrors) right at release time. The problem there is that it takes until the next sync of the lower tier mirrors before they open up content, which resulted in the master mirrors getting clobbered, especially right at release time. So, now we open things up a bit earlier to allow mirrors time to sync up before the announcement.
The net result of all these things is that many, many folks I talk to today (the day before the release of Fedora 19) have already been on Fedora 19 for a long while, so the frantic spike of release day is smoothed out over more time. We end up with more testers/users pre-release, and this means people who do wait for release day get a much more stable and usable release than before. I think this is overall a win. ;)

Fedora 18 to 19 yum upgrade

With Fedora 19 being final (release is Tuesday), I went and upgraded my main virthost and guest from Fedora 18 last night. I used fedup to upgrade a laptop and another machine, but as my main server is a bit more complex, I usually just do a yum upgrade on it.

Looking at: https://fedoraproject.org/wiki/Upgrading_Fedora_using_yum#Fedora_18_-.3E_Fedora_19_.28pre_release_branched.29 there is currently only one gotcha noted there (systemd cgroups changing). I didn't hit anything related to that, but I rebooted pretty soon after the transaction finished.

Aside: There were a few texlive packages I had installed for some reason, which would have resulted in a vast increase in texlive packages after the distro-sync, so I went ahead and removed those before syncing to avoid that.

After rebooting I was hard pressed to find any problems, but finally did manage to find one: all my MySQL-using applications were no longer able to connect to the database. Digging into the logs showed me that it was a password format change between the old Fedora 18 MySQL and the new Fedora 19 MariaDB. I simply had to go in and set the passwords for those users again, which updated them to the new hash setup, and everything started working. Not sure if this is something that could be fixed in a MariaDB update or noted in the release notes, but it's easy enough to fix up.

So far that's it. Everything else is working just fine, no problems at all. I think Fedora 19 is going to be a very good release, and I hope everyone enjoys it.

This week in rawhide 2013-06-25, a word about comps

So another week in rawhide without too much blowing up. However, astute folks might have noticed that there was no rawhide compose on 2013-06-21. This was due to a typo in the rawhide comps file.

What is a comps file, you might ask? Well, it's the XML file that's used to determine what packages are in what groups, what is shown in the installer, and so forth. It's pulled from a git repository every time rawhide composes, and unfortunately if there's an error in it, the entire compose just stops and fails. In theory this sounds kinda fragile, especially since XML is horrible to edit, but in practice it very seldom seems to get broken, and when it does it gets fixed the next day.

If you are a maintainer making changes, do try and be extra careful when you do. See: https://fedoraproject.org/wiki/How_to_use_and_edit_comps.xml_for_package_groups for more information on this little corner of the compose world.
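
One cheap way to catch the sort of typo described above is to check that the file still parses as XML before committing it. What follows is only a minimal sketch using Python's standard library: the XML fragments are made-up, heavily simplified stand-ins for a real comps file, the function name is hypothetical, and this only checks well-formedness, not whether the groups make sense. It is not the check the compose process itself uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified comps-style fragment (illustration only).
GOOD_COMPS = """<comps>
  <group>
    <id>example-group</id>
    <name>Example Group</name>
    <packagelist>
      <packagereq type="mandatory">bash</packagereq>
    </packagelist>
  </group>
</comps>"""

# The same fragment with a typo: the closing tag for <name> is misspelled.
BAD_COMPS = GOOD_COMPS.replace("</name>", "</nmae>")


def check_well_formed(xml_text):
    """Return (True, "") if xml_text parses as XML, else (False, error text)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


if __name__ == "__main__":
    for label, text in (("good", GOOD_COMPS), ("bad", BAD_COMPS)):
        ok, error = check_well_formed(text)
        print(label, "OK" if ok else "FAILED: " + error)
```

A check like this won't catch a misspelled package name, but it would flag a mismatched or unclosed tag before the nightly compose trips over it.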

Attention Fedora 19 prerelease users

Fedora 19 is winding up to release soon, so it's that time in the cycle when a new fedora-release package is pushed out that disables the fedora-updates-testing repository, so folks who install after this point don't get testing packages unless they opt in. That update was just pushed into the base Fedora 19 repo today: https://admin.fedoraproject.org/updates/fedora-release-19-2

This means that if you installed Fedora 19 before today, you very likely have updates-testing packages installed. You should do one of the following things:

a) If you wish to help test updates ( https://fedoraproject.org/wiki/QA:Updates_Testing ), simply re-enable the updates-testing repository: "sudo yum-config-manager --enable updates-testing" and go on to help test and provide feedback for testing updates.

b) If you do not wish to continue testing updates-testing packages, you will need to sync your install with only the released/base/stable repository: "sudo yum distro-sync" and then you can move forward getting only the normal updates.

Note that in both cases you should continue to have the base fedora repository and the fedora-updates repository enabled. See 'yum repolist' if you wish to see what you have enabled.
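
For a rough picture of what is being switched on and off here, yum repository definitions are INI-style files with an "enabled" setting per repository. The sketch below is illustrative only: the sample text is a made-up stand-in rather than the contents of the real fedora-release package, the function name is hypothetical, and treating a missing "enabled" line as enabled is an assumption about yum's default. 'yum repolist' remains the authoritative way to check.

```python
import configparser

# Made-up sample in the INI style used by yum .repo files (illustration only).
SAMPLE_REPOS = """\
[fedora]
name=Fedora $releasever - $basearch
enabled=1

[updates]
name=Fedora $releasever - $basearch - Updates
enabled=1

[updates-testing]
name=Fedora $releasever - $basearch - Test Updates
enabled=0
"""


def enabled_repos(repo_text):
    """Map each repository id in repo_text to True (enabled) or False."""
    # interpolation=None so '%' or '$' in values is left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    return {
        # Assumption: a repository with no 'enabled' line counts as enabled.
        section: parser.get(section, "enabled", fallback="1").strip() == "1"
        for section in parser.sections()
    }


if __name__ == "__main__":
    for repo_id, is_enabled in enabled_repos(SAMPLE_REPOS).items():
        print(repo_id, "enabled" if is_enabled else "disabled")
```

In this sample, option a) above corresponds to updates-testing flipping back to enabled, while option b) leaves it disabled and brings the installed packages back in line with the fedora and updates repositories.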

Rawhide week in review 2013-06-11

Another week, another rawhide review post. :)

This week the main thing that stood out for me was the rsyslogd breakage. A new upstream version landed and it crashed on startup. ;( A bug was filed ( https://bugzilla.redhat.com/show_bug.cgi?id=971471 ) and the maintainer promptly started debugging it. While a fix hasn't landed yet, it looks like one is very close. I'd like to call out the rsyslog maintainer for doing exactly what a maintainer should do: respond promptly on bugs and work hard on fixing the problem. Kudos.

Proposals are starting to come in about changes for Fedora 20, and those will likely land before too long. Let's try and keep everything working and happy while they do, ok?