Rawhide: Notes from the trail (2015-12-23)

nirik

2015-12-23 12:47

A short note from the trail for those folks riding along with rawhide this winter: openQA ( https://openqa.fedoraproject.org/ ) is running on daily rawhide composes, letting everyone know when the install path is not working as expected. This has already resulted in several quick fixes and brought things back on track in the last month. It's really nice to get the install path to rawhide to be as stable as the day to day trail ride.

There is currently an issue with applications that use gtk3 and gtkmm30 crashing on start. The gtk3 folks are working on a fix. If you need something that uses gtk3/gtkmm30 right now, downgrade to gtk3-3.19.4-1.fc24.

Happy Holidays and Merry Christmas from the trail!

On giving

nirik

2015-12-22 14:07

Toward the end of the year, many folks' thoughts turn to giving, either a monetary donation or time and energy. I think there's a number of reasons for that: in the US at least, tax deductible donations are something people want to get in before the end of the year, people know more about how much they might have to donate, and they have more time away from the pressures of work to think about groups that might be worthy to donate to. Whatever the case, I thought I would share a few thoughts around such donations.

First, it's good to ponder what kind of gift you might be able to provide. Sure, if you have some money you can allocate that's great, but there are lots of other ways to give back. Try asking what you can do to help and most any non profit group will have a long list of helpful things to choose from.

Next, take a look at what groups you want to support. This of course is highly personal, but do take a look at information about a group before donating to them. For example, many of us in the US will see people outside shops and stores ringing a bell and asking for donations to the Salvation Army. At first glance you might think this a worthy group helping the less fortunate, but looking for further information will show you that the group is highly religious (for example requiring people they help to listen to them proselytize their beliefs) and discriminatory to LGBT folks. Additionally, look at how much of your donation might actually get to the people the group purports to help and not get used for "overhead". Legit non profit groups are required to publish their finances.

Once you have decided to give to some groups, look and see if there are any matching fundraisers going on or planned. This is where existing members will match new member donations over a time period to try and help the group and add more members. Your donation can go further to help the group if you wait and donate at such a time. Also, check with your employer: many companies have matching programs where they will match your donation to non profit groups. This is another great way to get more money to the non profit you are wanting to help.

I've decided to donate some money to three groups this year:

Electronic Frontier Foundation - eff.org. I was able to donate a few weeks ago when they were doing a matching fund drive, and also my wonderful employer (Red Hat) offers matching funds, so I was able to get them 3x what I donated. If you are at all active in online rights and freedoms, you know who the EFF is. These folks do great work and everyone should support them.
The Software Freedom Conservancy - https://sfconservancy.org. They are also currently holding a matching funds fundraiser and my wonderful employer (Red Hat) offers matching funds, so again I was able to get them 3x the funds of just my donation. These folks are doing great work handling legal and paperwork for open source groups as well as being the only ones willing/able to fight for enforcement of the GPL. They lost a number of spineless corporate backers this year with their GPL enforcement battles and it's very important that we help them to keep going.
Colorado Greyhound Adoption - https://coloradogreyhoundadoption.org. This is the greyhound group that I have worked with in the past and got my wonderful pups from. They do great work, but just one dog with high medical bills can wipe out their entire medical budget. Sadly, I wasn't paying attention and there was a matching day on the 8th that I missed, but at least my employer (Red Hat) will match my donation to them, so they get 2X as much as my donation.

I hope you all will think about giving back to the groups that help advance causes you believe in.

Rawhide: notes from the trail (2015-11-16)

nirik

2015-11-16 08:47

There have been a lot of things going on in rawhide of late, so it's past due for another note from the trail. :)

The python 3.5 rebuild has landed. The vast majority of it was done in a side tag by Peter Robinson, Kalev Lember and Robert Kuska (and others!), then merged back into rawhide on Friday (the 13th). There are still a number of packages that need fixes to build against python 3.5; expect most of them to get fixed up this week. If you have some of those installed, dnf may well hold back python3 and all the newly rebuilt packages until the ones you have installed are all fixed or you remove them.

There was a dracut bug over the weekend (landed Friday and fixed this morning) that might result in newly updated kernels not having an initramfs. Make sure you get the version from this morning before installing/upgrading kernels. (That was: bug 1281917)

On the kernel front, the 4.4.0 merge cycle has been running the last few weeks, resulting in tons of changes and some fun bugs. If your tun/tap devices don't work, that is bug 1281674, which already has a fix, hopefully upstreamed soon. I've also seen plymouth unable to prompt for decryption on boot (not yet filed), and there was breakage in video for VMs (hopefully already fixed upstream). Likely there will be a 4.4.0 rc1 kernel for rawhide today with many of those fixes.

There were no rawhide livecds or arm appliance images for a few days late last week. This was due to the builders all being moved to Fedora 23 and missing a few packages they needed for koji to be able to create those images. That should be all solved now, however some images still may not compose due to lingering python3 deps.

On the graphical front, Gnome logins switched from being Xorg by default and having a special "Gnome wayland" session for wayland to the reverse. Wayland is now the default with the normal "Gnome" session. There is a "Gnome Xorg" session now if you need to fall back to Xorg for any reason. How well this ends up working out will decide if Fedora 24 Workstation will default to wayland or not. In my testing, Gnome/Wayland is perfectly usable day to day now; many of the issues in the past have been fixed up. There's still a lot of polish type things to get right I think, for example I use sloppy focus and it sometimes doesn't act right with wayland.

Whew. As usual lots going on in rawhide and always a fun bit of trail riding. Until next time, keep them doggies movin!
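
For the dracut issue mentioned above, here is a minimal Python sketch of a check for kernels that were installed without an initramfs. It assumes the usual Fedora layout of /boot/vmlinuz-<version> paired with /boot/initramfs-<version>.img; the function name is made up for illustration and this is not part of dracut or dnf.

```python
from pathlib import Path


def kernels_missing_initramfs(boot_dir="/boot"):
    """Return kernel versions in boot_dir that have no matching initramfs image.

    Assumes the vmlinuz-<version> / initramfs-<version>.img naming convention.
    """
    boot = Path(boot_dir)
    missing = []
    for kernel in sorted(boot.glob("vmlinuz-*")):
        version = kernel.name[len("vmlinuz-"):]
        if not (boot / f"initramfs-{version}.img").exists():
            missing.append(version)
    return missing


if __name__ == "__main__":
    for version in kernels_missing_initramfs():
        print(f"no initramfs found for kernel {version}")
```

If this prints anything after a kernel update, it is probably worth regenerating the initramfs for that kernel before rebooting.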

Lenovo Yoga 900 and Fedora Review

nirik

2015-11-06 13:55

Just about 2 years ago now I picked up a Lenovo Yoga 2 Pro, which I have been using happily since then. My full review was: Lenovo yoga 2 pro and Fedora review and then a followup a year later at yoga 2 pro: a year later. It's been a great little laptop, but I decided it might be time to replace it soon, mostly because the warranty on the yoga 2 pro was done, the cpu was showing a bit of age and I was hoping there would be some advancements in laptops over the last 2 years. A few weeks ago, Lenovo came out with the Yoga 900, the successor to last year's Yoga 3 pro, which in turn succeeded my Yoga 2 pro. The stats and early reviews looked pretty nice, so I ordered one. I was hoping for a smooth Fedora experience, but sadly I ran into two issues right away after booting from a Fedora Live USB:

The wireless didn't work. It didn't even see the interface. Of course I ran into this same problem 2 years ago with the yoga 2 pro: it was the ideapad_laptop module not knowing about this laptop and thinking it had hardware rfkill set on all the wireless modules. A quick bit of hacking on that module and it no longer showed the interfaces as hardware blocked. Unfortunately, it still didn't work. I looked then to confirm that the Intel 8260 wireless was supported by Linux, and indeed it is. Looking closer, I saw that the PCI ID of the card I had was not in fact listed in the driver. They had "0x24F4, 0x1130" and mine was "0x24F3, 0x1130". Fixing that quickly in the module got the iwlwifi module working with the interface and all was well. This bug tracks these issues (both of which should be headed upstream): bug 1275490
The touchpad and touchscreen didn't work. This seemed to be a probing issue with the i2c-designware platform. A few posts to the linux i2c-devel list and some exchanges with an Intel engineer and I had a one line patch that gets it working. Hopefully that will be upstreamable soon.
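
To illustrate why the wireless card in the first item was ignored even though the chipset is supported, here is a small Python sketch of an exact-match ID table. It is only a stand-in: the real iwlwifi table is C code in the kernel, and the table name, helper names and config string below are made up. Only the two ID pairs come from the text above.

```python
# Hypothetical stand-in for a driver's PCI ID table:
# (device ID, subsystem ID) -> name of the config to use.
SUPPORTED_IDS = {
    (0x24F4, 0x1130): "example-8260-config",
}


def find_config(table, device_id, subsystem_id):
    """Return the config for an exact (device, subsystem) match, else None."""
    return table.get((device_id, subsystem_id))


def with_added_id(table, device_id, subsystem_id, config):
    """Return a copy of the table that also claims the given ID pair."""
    patched = dict(table)
    patched[(device_id, subsystem_id)] = config
    return patched


if __name__ == "__main__":
    # The listed ID matches; the one-digit-different ID on this laptop does not.
    print(find_config(SUPPORTED_IDS, 0x24F4, 0x1130))
    print(find_config(SUPPORTED_IDS, 0x24F3, 0x1130))
    patched = with_added_id(SUPPORTED_IDS, 0x24F3, 0x1130, "example-8260-config")
    print(find_config(patched, 0x24F3, 0x1130))
```

Because the lookup is an exact match, a card whose ID differs by a single digit is simply never claimed by the driver until its ID pair is added to the table.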

With those issues out of the way, I went ahead and did a rawhide install on it, copied my /home over to it and then updated my ansible playbooks that setup all the things I want on my laptop. It was a pretty painless process overall and then I was up and running on the new machine. :) Good things:

All the nice things about the yoga 2 pro came over: It's actually a bit smaller and lighter, the screen resolution is still 3200x1800, and it fits in the laptop sleeve I got for the yoga 2 pro.
It's really fast. The latest gen i7-6500U's are really zippy.
I've not traveled too much with it yet, but in the short times I have the battery life has been great. Estimating around 12 hours or so, almost 2x the battery life on the yoga 2 pro.
The hinge is a new "watchband" style. I never had any issues with the yoga 2 pro's hinge, but this one looks a good deal sturdier.
16GB of ram (over the yoga 2 pro's 8GB) is nice. Might run some vm's now. :)
The laptop backlight now has 3 levels: bright, low, off. I could see using the low setting in some cases.
It has a USB-C port instead of the mini-HDMI on the yoga 2 pro. I have no USB-C cables, but this seems like it will be a win, allowing me to connect (with the right cables) to DP, HDMI, USB, whatever via the USB-C port. I am not clear on whether I could charge from this port as well, but I'll see when/if I get a cable. :)
The power port is also a USB 2 port. Could be handy at times when you don't have power and need to attach another usb device.
The firmware actually has a way to enter secure boot setup mode and add your own keys. The yoga 2 pro didn't have that.

Bad things:

Just like the yoga 2 pro, firmware updates seem to require windows. ;( Lenovo provides handy iso updaters for many laptops, but not this one. Likely because they outsourced the firmware to some 3rd party. ;(

Just different:

The keyboard feel is different. I am not sure if I like it better, don't like it, or just need to get more used to it.
512GB ssd instead of a 128GB in the yoga 2 pro. I wasn't even using the 128, so it doesn't really seem that much different. :)

Overall I think the upgrade is well worth it, but it's all mostly incremental improvements, nothing jaw droppingly awesome. The active warranty, cpu speed, memory and battery life all make it worthwhile IMHO.

The 5 states of the modern sysadmin

nirik

2015-10-12 14:48

I think there are (at least) 5 states you might find yourself in as a sysadmin these days:

Day to day things that aren't (yet) automated.
Automating and designing for the future.
Fires and outages
Interruptions
Time to dream

In general you will want to move things from the first bucket to the second as much as you can. Automate all the things that happen often, or (re)design things so you no longer have to do such things day to day. So if you find yourself restarting a webserver every day, figure out why it needs that and fix that. Or if you have to spend lots of time processing new lists or resetting passwords, make those self service. This might also include reading emails and lists and feeds: if you spend a lot of time here that you don't get any benefit from, perhaps it's time to drop some of those?

Of course outages and critical problems are another chunk of time. When things stop working at 2pm or 2am, you start working on bringing things back to normal. Bucket two helps here as well: if you find that things are often causing outages or problems, redesigning them or fixing the underlying problem saves stressful outage time. It's important after outages and critical problems to try and have some time from bucket 5 as well. Often just thinking about how the problem happened and spending some time looking at it can give you an idea for a design to use in bucket 2 to fix it once and for all.

Interruptions are another chunk of time. While you might think it "only takes a minute" to ping your sysadmin about something, it takes a while to get back to the second and final buckets above where (hopefully) you are spending your time. This is why filing tickets or using some other tracker for your non critical things helps a great deal. Then they can be processed in the first bucket when you have time to do that, and you also have a nice record of what you need to work on in the second one.

Finally, another place that is great for you to have time is a category I call "time to dream". This is usually unstructured time where you as a sysadmin can ponder how things are set up, look at logs or machines you don't usually deal with, read about tools that you might be able to use to make things better, or just think up a totally new workflow for something.

If you find yourself in a sysadmin job where you spend all or most of your time in buckets 1, 3 and 4 you have a definite problem. So, all you sysadmins out there: are there other states you find yourself in often that I haven't mentioned yet?

Fun with wifi

nirik

2015-10-03 15:23

The last few weeks I ran into some really nasty problems with wifi on my laptop: It started cutting out and dropping me offline for several minutes at a time and was slow and pretty unusable. Of course the first thing I thought of was a bug in the rawhide Linux kernels (I run rawhide full time on my primary laptop), but going back to previous kernels or even Fedora 23 or 22 stable kernels showed the same behavior.

Looking at the disconnects and re-associations, I was pointed to the network possibly just being very congested. So, I moved the channel my AP was on (man are there a lot of access points in this area now) and played around with the settings (my access point is a WNDR3700v2 running OpenWrt). That seemed to help a bit, but not all that much. I also noticed that all the other devices in the house seemed fine (sadly, most of them don't have a good way to show signal strength or status, but they didn't seem to be reconnecting).

Along this journey I got to play with the 'iw' and 'nmcli' commands a lot. I would recommend you all take a look at iw, it can sure tell you a ton of information about your wireless connection and how it's working or not working. nmcli has also come a long way. 'nmcli d wifi' is something I am now using all the time rather than looking at the pull down menu of networks.

Finally, I swapped out the wireless card in my laptop. I had gotten a replacement one years ago when I got my laptop and read reports of poor wireless performance, but never swapped it in since the existing card seemed to work. The new card worked TONS better. I can only conclude that the old one was going bad in some way. :(

Of course being a computer geek I took this as a great time to look into setting up WDS (Wireless Distribution System, basically a way for one AP to become a client of another and bridge all the connections to/from it, extending your range) in my house. My main access point is upstairs on one end of the house, and I am downstairs a fair bit of the time. I picked up some little NEXX WT3020's for travel (also running OpenWrt), so it seemed it would be easy to set up one of them downstairs as a WDS client connected to the main AP. OpenWrt makes it pretty easy to set this up these days, but there were a few gotchas I ran into: Make sure you set the frequency width to 40 on both the main AP and the WDS client, and make sure you set the static ip for the client AP right (or else you won't be able to easily get to it to configure it). Otherwise the instructions at http://wiki.openwrt.org/doc/howto/clientmode worked fine. I'm not sure how much better it really makes things, but it's kind of nice to have if I ever need it.
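
For the channel congestion check mentioned earlier in this post, a small Python sketch along these lines can count how many access points sit on each channel. It assumes the scan was captured with something like 'nmcli -t -f SSID,CHAN,SIGNAL device wifi', which should print one colon-separated line per access point; the helper names are made up, and it only counts same-channel neighbors, ignoring overlap from adjacent channels.

```python
from collections import Counter


def channel_counts(terse_output):
    """Count access points per channel from SSID:CHAN:SIGNAL lines."""
    counts = Counter()
    for line in terse_output.splitlines():
        if not line.strip():
            continue
        # Split from the right so a colon inside an SSID does not shift fields.
        _ssid, chan, _signal = line.rsplit(":", 2)
        counts[int(chan)] += 1
    return counts


def least_crowded(counts, candidates=(1, 6, 11)):
    """Pick the candidate channel with the fewest access points seen on it.

    The default candidates are the usual non-overlapping 2.4GHz channels.
    """
    return min(candidates, key=lambda ch: counts.get(ch, 0))


if __name__ == "__main__":
    # Made-up sample scan, not real data.
    sample = "home:6:80\nneighbor:6:55\ncafe:1:40\n"
    counts = channel_counts(sample)
    print(dict(counts), least_crowded(counts))
```

The sample text in the demo is invented; feed the function the saved output of a real scan to get numbers for your own area.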

A fully ansible powered Fedora Infrastructure

nirik

2015-09-25 14:45

Just a week or so shy of 3 years ago, Fedora Infrastructure embarked on a journey to migrate to using ansible to manage all our hosts. It's been a long road and one we could have done faster, but we wanted to do things in a transparent and measured way, not rushing and making quick decisions and changes that we would have to clean up later. Today that journey is completed. So why ansible? What made us start this journey? We had a CM setup and it was working, but after weighing the advantages and disadvantages we definitely felt moving to ansible would be worth it. Just a few of the reasons:

We are very much a python shop; ruby syntax isn't hard, but it's just another thing to remember when making changes.
Not having to have agents on all our nodes is a big win memory and resource wise.
Not having to have yet another ssl cert setup for our CM is a great win.
We could drop func (which we were using for some ad-hoc things) in favor of just using ansible for those things.
Not having to worry about version skew between our management host and clients was a big win.
Easier to explain and show how things work to new folks was a big win.
Not having to worry about scaling a central host to the number of managed nodes was a big win.
... and I'm sure more things I am forgetting now...

Of course just because we have finished this migration doesn't mean we don't have lots and lots of other things ahead of us. In 3 years ansible has changed some and we need to go back and clean up some of those things that currently work, but aren't ideal anymore. Ansible 2.0 should be out before too long and we need to look at moving to that and helping out with its testing wherever we can.

I'd like to thank the Fedora Infrastructure team and all those who contributed to our migration efforts as well as the excellent ansible developers and community. We could not have done it without you!

Rawhide: Notes from the trail: dnf-1.1.2-2.fc24 will plum rustle your cattle

nirik

2015-09-23 11:30

Just a heads up for you intrepid rawhide users: The latest (as of this writing) dnf update, version 1.1.2-2.fc24, seems to break a lot of things you might want dnf to do (like update packages or list them or anything). Bug https://bugzilla.redhat.com/show_bug.cgi?id=1265336 is open to track the issue. In the meantime you may want to manually download 1.1.1-2.fc24 from koji.fedoraproject.org, downgrade, and add an 'exclude=python3-dnf-1.1.2-2.fc24' to your /etc/dnf/dnf.conf.

Keep 'em movin!
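
As a rough sketch of the exclude step above: dnf.conf is an INI-style file, so the line can be added to its [main] section with Python's configparser. The function name is made up, the sketch treats existing exclude values as space-separated (an assumption), and configparser drops comments when it writes the file back out, so editing the file by hand is just as good.

```python
import configparser
import io


def add_exclude(conf_text, pattern):
    """Return conf_text with pattern added to the exclude list in [main]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    # Assumption: existing exclude values are separated by whitespace.
    current = parser.get("main", "exclude", fallback="").split()
    if pattern not in current:
        current.append(pattern)
    parser.set("main", "exclude", " ".join(current))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    print(add_exclude("[main]\ngpgcheck=1\n", "python3-dnf-1.1.2-2.fc24"))
```

Remember to take the exclude back out once a fixed dnf build lands, or you will never get the newer version.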

Bodhi2 and its much faster updates flow

nirik

2015-09-20 10:52

Now that we have had bodhi2 in production use for a while, I thought I would talk some about its backend and how it manages updates pushes. First, for a bit of background, let's mention how bodhi1 did things:

release engineers would gather a list of pending updates and look over them.
release engineers would sign those updates
release engineers would tell bodhi1 to push them. Typically in freezes (alpha, beta, final) the branched updates-testing ones were pushed, then all the stable releases (both testing and stable). Outside of freezes, all branches were pushed at the same time (both testing and stable).
bodhi1 would start going through the list one by one and mashing the packages into repos.
At the end when all branches were done it would go through them one by one and gather updateinfo and other data it needed to inject.
It would then create rpm-ostree/atomic updates, one by one, for each branch that has one.
Then it would wait for all of them to be synced to the master mirrors.
Then it would go through them one by one and update bugs and note things in the bodhi updates.

As you might imagine this could take a very long time. It wasn't unusual to see this be 12-18 hours or more for the full set. Bodhi2, instead of being single threaded, is very much multi-threaded:

release engineers gather a list of pending updates and look them over.
release engineers sign updates
release engineers tell bodhi2 to push them.
bodhi2 looks at all the pending updates: Are any of them security? It will do all those branches that have security updates first.
Threads are fired off for each branch that is to be done (first all security holding ones, then all the rest).
In each of those threads, subthreads are fired off to: mash updates (using createrepo_c instead of the createrepo that bodhi1 used), gather updateinfo, create digests, i.e., anything that can be done independently and then collected together at the end.
When one branch is done, atomic trees are updated, it's synced out and all work is completed on it (updating bugs, etc).

So, where bodhi1 took 12-18 (or sometimes more!) hours to push all Fedora branches, bodhi2 is taking around 6 hours to push them all. The slowest of the branches is of course fedora-21 stable updates (it takes about 4.5 hours to complete), which makes sense as it's been out for a very long time now and has a very large pile of updates in it.

Since things are multi threaded in bodhi2, failures and issues are handled better as well. If something broke somewhere in bodhi1 pushes, everything ground to a halt until it was fixed and resumed. In bodhi2, if there's an issue, only that particular branch will fail and it can be resumed anytime. Everything else completes.

Moving forward there are even more improvements we hope to roll out: Right now bodhi handles the mashing of updates on its own instance, and can be somewhat limited by its I/O bandwidth, so we would like to move that out into koji and have koji do the mashes and create the updates trees. That way it can be farmed out to a bunch of separate builders. There are some createrepo_c changes we can hopefully get working to add multithreading to delta rpm creation, which may speed things up too. Finally we may tweak the threading to do all the security and non security branches at once (memory and I/O permitting).
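
The bodhi2 flow described above (one thread per branch, security branches first, independent sub-steps in parallel, and a failure stopping only its own branch) can be sketched in Python roughly like this. It is not bodhi2's actual code: the function names and the shape of the step callables are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def push_branch(branch, steps):
    """Run the independent per-branch steps in sub-threads and collect results."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(func, branch) for name, func in steps.items()}
        # result() re-raises an exception from a step, failing only this branch.
        return {name: fut.result() for name, fut in futures.items()}


def push_all(branches, steps, has_security):
    """Push every branch in its own thread, security-carrying branches first."""
    # sorted() is stable and False sorts before True, so branches for which
    # has_security() is true are submitted first, in their original order.
    ordered = sorted(branches, key=lambda b: not has_security(b))
    done, failed = {}, {}
    with ThreadPoolExecutor() as pool:
        futures = [(b, pool.submit(push_branch, b, steps)) for b in ordered]
        for branch, fut in futures:
            try:
                done[branch] = fut.result()
            except Exception as exc:  # a broken branch does not stop the others
                failed[branch] = exc
    return ordered, done, failed
```

Here steps would be a dict such as {"mash": mash, "updateinfo": gather_updateinfo}, where each value is a hypothetical function taking a branch name. Any branch that ends up in failed can be retried on its own with push_branch, which mirrors the "only that particular branch will fail" behavior.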

rawhide: notes from the trail (2015-09-18)

nirik

2015-09-18 15:32

Long overdue again for another "This week in rawhide". I realized that I haven't been able to do them weekly for quite some time, so I'm changing the name to 'notes from the trail' to play on the rawhide/cattle drive metaphor.

There was a good bit of rawhide discussion at flock this year and some of the things discussed there have already come to pass: Namely, Adam Williamson has set up openQA to run tests on branched (what will be Fedora 23) and rawhide. So, we now have some better idea when things are not working or didn't generate images to test, etc. It's sending emails to both the fedora devel list and the fedora test list. I posted about the flock discussions in https://lists.fedoraproject.org/pipermail/devel/2015-August/213527.html but sadly it devolved into a discussion of whether we should rename rawhide or not and if so, to what.

We still have not yet landed pungi4 for doing the rawhide composes, but hopefully that will happen before too long, and I am not sure what the progress of adding taskotron checks to rawhide builds is. Current rawhide should be back to 'mostly signed' packages. I have been signing things as time permits.

Finally, I would like to touch on something that has been mentioned in the various renaming rawhide to something else threads: perception. I cannot count the number of times I have seen someone appear in an end user support channel (#fedora, fedora-forum, users mailing list, etc) and say "Hi, I want to run rawhide, what do you think?" to which the answer is variants of "Oh no, rawhide breaks all the time", "It will blow up on you" or the like. On one side I can understand support folks saying that. They don't want someone hitting bugs and getting upset and they don't want someone who isn't experienced in troubleshooting to get in over their head. On the other hand, it's just not true anymore. I cannot recall the last serious breakage I ran into. The last bug I filed was about selinux preventing my root /var/spool/cron/root job from running (which was easy to notice and work around).

When I last redid the rawhide wiki page I made sure to add an "Audience" section to it: https://fedoraproject.org/wiki/Releases/Rawhide#Audience So, next time someone asks you if they should run rawhide, point them there and ask them if that describes them.