How to debug Fedora rawhide compose problems
From time to time rawhide composes fail and are not announced or synced to mirrors. In the past this would happen only if the very basic setup (a mock chroot with the 'buildsys-build' group installed in it) broke. Also in the past, rawhide composes in which many deliverables failed were still synced out and announced, leading to days when no images were available until the issue was fixed.

Now, with the latest version of pungi (the tool that composes Fedora releases, including rawhide), a compose fails if any deliverable marked in the configuration as not failable didn't complete. So, while rawhide can fail more easily, it is also much easier to revert a change that broke images and get it fixed before it lands, and images should always be available.

So, how can you tell if a rawhide compose (or some part of it) failed, and why? All the pungi logs are available, and since all the builds take place in koji, anyone can look at them as well. Rawhide composes are of course fedmsg-enabled, so you want to look for the https://apps.fedoraproject.org/datagrepper/raw?topic=org.fedoraproject.prod.pungi.compose.status.change topic. Composes can finish in one of 3 states:
- FINISHED - This means the compose finished and everything in it completed successfully. I am not sure we have yet seen this status in real life. ;)
- FINISHED_INCOMPLETE - This means the compose finished and only failable things failed. This is the "normal" status we see day to day.
- DOOMED - This means the compose failed its initial very basic setup and/or some deliverable marked as not failable failed. When this happens, the compose isn't synced out or advertised. This is the status where we need to find out what caused the problem and fix it, then either restart the compose or wait for the next day's compose. In IRC or on fedmsg you may see this status as "failed in a horrible fire", as that's what our fedmsg translates it to.
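
As a rough illustration of watching for these states, here is a minimal Python sketch that builds a datagrepper query for the topic above and picks out the composes that ended in one of the three final states. The message layout it relies on (a `raw_messages` list whose entries carry a `msg` dictionary with `status` and `compose_id` keys) and the `rows_per_page` query parameter are assumptions on my part, not something stated above, so check them against a real response first. The function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"
TOPIC = "org.fedoraproject.prod.pungi.compose.status.change"

# The three final states from the list above and what each means for syncing.
FINAL_STATES = {
    "FINISHED": "everything completed; synced out",
    "FINISHED_INCOMPLETE": "only failable deliverables failed; synced out",
    "DOOMED": "setup or a non-failable deliverable failed; not synced",
}


def build_query_url(rows=20):
    """Build a datagrepper URL for the compose status topic.

    'rows_per_page' is assumed to be an accepted query parameter.
    """
    query = urllib.parse.urlencode({"topic": TOPIC, "rows_per_page": rows})
    return f"{DATAGREPPER_RAW}?{query}"


def fetch_messages(url):
    """Fetch one page of messages (assumes a top-level 'raw_messages' list)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response).get("raw_messages", [])


def summarize(messages):
    """Yield (compose_id, status, needs_debugging) for final-state messages.

    Assumes each message has a 'msg' dict with 'status' and 'compose_id'.
    Messages in any other state, or without those keys, are skipped.
    """
    for message in messages:
        body = message.get("msg", {})
        status = body.get("status")
        if status in FINAL_STATES:
            yield body.get("compose_id"), status, status == "DOOMED"
```

Looping over `summarize(fetch_messages(build_query_url()))` would then give one tuple per finished compose, with the last element set to `True` only for DOOMED composes, which are the ones worth digging into.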