Justin's Journal

Wednesday, January 6th, 2016
4:55 pm - Two kinds of kernel bugs
There are two main types of kernel bugs: those which fail loudly (oops!) and those which do not.  I am not counting new hardware support bugs, as those are really feature requests.  As we have said many times recently, the Retrace Server is getting a lot of attention.  This is great: it quickly points out which of those noisy failures the most users are hitting on a given kernel release, and allows us to prioritize based on a combination of severity and the number of users hitting a given bug.  It can make things easier to track down, because we might have some idea of which kernel introduced the bug, and hopefully there is enough meat to the entry for us to know where to start looking for a solution.  Unfortunately, the second type of bug is not so clear.  These are things like regressions in hardware support (my wifi worked fine in kernel x, but no longer works in kernel y), and various other regressions where, silently, things no longer work as they should or used to.  What follows will hopefully make it easier to deal with that second type of bug.
There are three very important points that I should get out of the way first:

  • Please test with the latest available kernel for the release you are running.  Any fix will be on top of that, so it saves a lot of time and effort on both sides to know that the bug still exists in the currently supported kernel.  When people file bugs against old versions, the first thing we will ask is whether this is still occurring in the latest update.  Basically nothing will happen towards getting your bug fixed until we know that it is.

  • Search bugzilla, and make sure no one else has filed a bug on this issue. If they have, don't open a new bug; comment on the existing one.  Any new or relevant information can be a big help here. If you see something that looks similar, but is not really what you are seeing, file a new bug. Adding that your wifi card doesn't work when docked to a bug about the same type of card not working at all will only cloud both issues, or get your issue ignored when the original is fixed and the bug is closed.

  • Search google, and see if there has been discussion of this issue upstream or elsewhere.  If there has been, mention that in the bug description and link to that external information.  It saves us time, and can make a huge impact on how quickly your bug gets fixed.

As I am sure many of you are aware, bugzilla generates a lot of email. While the web interface does have some interesting search capability, email is the main method of getting notified of new bugs.  The better those initial emails (your bug reports) are worded, the more likely we are to have a real understanding of the nature and priority of that bug. Now that retrace is decently usable, I tend to ignore any bugzilla mail from abrt and assume that it will show up in the retrace results if it is a high priority. That doesn't mean those bugs do not get addressed, only that we have a good system for prioritizing them outside of bugzilla.  This saves time and makes it easier to actually read through the new bugs filed against the kernel in other areas.
Bugzilla mail is sorted and works just like any other email list you might want to keep up with, except that the "from" field is irrelevant: we can't see who filed the bug when scanning a mailbox.  The first thing of any interest that we see is the subject.  Something generic here means I am less likely to take an interest after a long weekend when I have a lot of mail to go through. If your subject is clear and concise, it is more likely to grab attention, even more so if it includes the first kernel version where the bug was seen.
Once we have an idea of the actual issue, someone is more likely to take the time to read your description.  Again, clear and concise is good, but actual detail is important. If this is a matter of hardware no longer showing up between two kernel versions, tell us which was the last known good, and which was the first known bad. Mention that a dmesg from bootup with each of those kernels is attached (and attach them!). If behavior changed, explain exactly how it changed, and if you can show output from specific tools which will help, mention it in the description, and actually attach that output.  These types of details tell us that this is a bug where we have some idea of where to begin.
These are the types of details which get me to open your bug in a tab so I remember to go back to it when I finish scanning the new bug mail.  If this is a catastrophic failure, such as data corruption or the like, please mention that in both the subject and the description; it really helps us prioritize.  Finally, when we ask for more information, please try to respond.  Chances are, if we have asked for more information on your bug, it has been filed away as "done all I can do until I get a reply", which means I am unlikely to look at it again until I get an email saying there is new information.
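To make the subject and description advice concrete, here is a minimal Python sketch of what a report could contain. The function names, the driver name, the kernel versions and the attachment names in the usage note are all made up for illustration; nothing here is a bugzilla API.

```python
def bug_summary(component, symptom, last_good, first_bad):
    """Build a one-line subject that names the regression and both kernels."""
    return f"{component}: {symptom} (worked in {last_good}, broken since {first_bad})"


def bug_description(symptom, last_good, first_bad, attachments):
    """Build a short description listing both kernels and the attached files."""
    lines = [
        symptom,
        f"Last known good kernel: {last_good}",
        f"First known bad kernel: {first_bad}",
        "Attached:",
    ]
    # One line per attachment, so it is obvious what was actually attached.
    lines += [f"  - {name}" for name in attachments]
    return "\n".join(lines)
```

For example, bug_summary("iwlwifi", "wifi no longer associates", "4.2.8-300.fc23", "4.3.3-300.fc23") gives a subject that names the hardware, the change in behavior, and both kernels in one line.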
Sorry this has been long winded, but it is something we on the kernel team spend a lot of time dealing with.  Anything we can do to streamline the process and correctly prioritize will be a big help towards a better working Fedora kernel, and that's good for everyone.

Friday, October 16th, 2015
12:47 pm - Triggering things off of koji events
There are many reasons you might want to trigger events off of koji builds. I take kernel build completions as a trigger to automatically test the builds. I also use rawhide kernel build starts to kick off the scripts to build rawhide-nodebug kernels. Others might want to rebuild a dependent package, trigger some action off of a build failure, or push a git tree, etc. I thought I would share the snippet of code that actually monitors fedmsg for koji events in case it might be useful to others.

import logging
import fedmsg
import fedmsg.config
import fedmsg.meta

# Setup for the fedmsg listener
config = fedmsg.config.load_config([], None)
config['mute'] = True
config['timeout'] = 0

# Actually start the listener
for name, endpoint, topic, msg in fedmsg.tail_messages(**config):

    # Start looking for the items we care about
    if "buildsys.build.state.change" in topic and msg['msg']['instance'] == 'primary':
        matchedmsg = fedmsg.meta.msg2repr(msg, **config)
        if "completed" in matchedmsg and "kernel" in matchedmsg:
            # do what you want here, for example:
            print(matchedmsg)

That is basically it. A few things to note:

  • The fedmsg "buildsys.build.state.change" topic relates to official koji builds. If you are looking for scratch builds, look for the "buildsys.task.state.change" topic instead.

  • Checking msg['msg']['instance'] tells you which koji instance the message comes from. If you don't do this, you will get messages from all instances currently reporting (primary, arm, ppc). This can mean repeated triggers, as most packages built on primary are later built on the other instances.

  • matchedmsg is essentially the "human readable" message at this point, and easy to search. An example message looks like: buildsys.build.state.change -- labbott's kernel-4.3.0-0.rc5.git1.1.fc24 completed http://koji.fedoraproject.org/koji/buildinfo?buildID=691944

  • This isn't the absolute most efficient method. The msg['msg'] dict contains a good bit of interesting information.  For instance, I could check for completed before running fedmsg.meta.msg2repr on it. This would eliminate more than 50% of those calls, but performance hasn't been an issue. For the ease of blogging this, it is easier to tell you to search for the desired build state by name than tell you that msg['msg']['new'] == 1 means that the build state is now complete.
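
A minimal sketch of that cheaper check, working on the raw message dict so that msg2repr is never called. The helper name and the constant are made up for illustration. It assumes the message body carries "instance", "new" and "name" keys, and that "new" follows Koji's build-state numbering with 1 meaning complete; verify both against a real message before relying on it.

```python
# Assumption: Koji numbers build states 0=building, 1=complete, 2=deleted,
# 3=failed, 4=canceled, and fedmsg passes that integer through as 'new'.
KOJI_BUILD_COMPLETE = 1


def is_completed_kernel_build(topic, msg, instance="primary"):
    """Return True for a completed kernel build on the given koji instance."""
    if "buildsys.build.state.change" not in topic:
        return False
    body = msg.get("msg", {})
    return (
        body.get("instance") == instance
        and body.get("new") == KOJI_BUILD_COMPLETE
        # Exact package name match, assuming the body has a 'name' key.
        and body.get("name") == "kernel"
    )
```

Inside the listener loop this would replace both "if" tests with a single call. Note that the exact match on the package name is stricter than searching the readable text for "kernel", which would also match other packages with kernel in their name.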

Fedmsg is also quite useful for several other things in the Fedora infrastructure, and it is very well documented.

Monday, April 20th, 2015
4:25 pm - Easier kernel-test results submission
Several of you have been running the kernel test suite. We appreciate it! I know it has been a bit of a PITA to get us results by manually submitting them on the website. Today, that has changed. We support automatic submissions now. To enable this, copy the config.example file in the git checkout to .config and edit as necessary, and your logs will be submitted using the method that you choose. The .config file is fairly simple. It lets you choose whether to submit anonymously, FAS authenticated in order to get badges, or not at all, in which case everything will continue to work as it has been.

We appreciate those of you who have taken time to run the test suite on kernel builds. Hopefully this will make the process a bit easier.

Saturday, August 9th, 2014
9:55 am - The Kernel test front end is live!
We have been working on the kernel test infrastructure for a while now, but we didn't have anywhere for people to actually see the results. Now the front end is live on apps.fedoraproject.org/kerneltest. Not only can you see the results, you can upload your logs, and earn badges for doing so! Right now, running the tests takes a little bit more effort on your part, and that is the next thing to be simplified. Here is how things work now:

git clone https://git.fedorahosted.org/git/kernel-tests.git
cd kernel-tests/
sudo ./runtests.sh

When the test suite finishes, it will tell you where the log file is, and point you to the front end site to upload it.
In the future, we will have an opt-in to auto submit logs, and an actual package in Fedora vs having to deal with git directly.
Submitting a test can earn you this awesome badge. Soon there will be additional badges for continued effort in helping us test the kernel.

Wednesday, December 18th, 2013
10:45 am - Rawhide nodebug moving back to tracking rawhide.
Many of you have been using the Rawhide Nodebug repository to test the 3.12 stable kernels while we got Fedora 20 out the door. We really appreciate the feedback from those who have tested. This is a heads up that in one week (December 25th) the rawhide nodebug repository will return to tracking rawhide and the 3.13 development cycle. Both Fedora 19 and Fedora 20 have the most recent 3.12.5 kernel submitted for updates-testing. If you wish to remain on rawhide nodebug and test development kernels, we really appreciate it. If you wish to remain on a 3.12 kernel, now is the time to disable the rawhide-nodebug repository. To do so, simply flip the value for enabled from 1 to 0 in /etc/yum.repos.d/fedora-rawhide-kernel-nodebug.repo.
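
For anyone who would rather script that flip than edit the file by hand, here is a minimal Python sketch. The function name is made up for illustration, and it assumes the usual "enabled=1" line format of a .repo file.

```python
import re


def disable_repo(repo_text):
    """Return the .repo file text with every 'enabled=1' line set to 'enabled=0'."""
    # Match whole lines only, so other keys and comments are left untouched.
    return re.sub(r"(?m)^([ \t]*enabled[ \t]*=[ \t]*)1[ \t]*$", r"\g<1>0", repo_text)
```

To apply it, read /etc/yum.repos.d/fedora-rawhide-kernel-nodebug.repo, pass the contents through disable_repo, and write the result back as root. Note that it flips every section in the text it is given.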

Wednesday, November 6th, 2013
7:57 am - Rawhide nodebug and the 3.12 kernel
We have a slight issue with the 3.12 kernel timing in that it is too late to push it into Fedora 20, but too far away from the Fedora 20 release to just ignore the 3.13 development cycle until release. As a result, we will be tracking 3.12 and stable updates for it in the rawhide-nodebug repository. This gives us a chance to keep it built and tested on all primary architectures, and make sure we are in good shape to push 3.12 out as an update as soon as possible. Once 3.12 can be pushed to releases, the rawhide-nodebug repository will return to doing non debug builds of rawhide, tracking Linus' tree upstream. I will let everyone know that is happening through the same channels with a couple of days notice.

Friday, October 4th, 2013
9:15 am - Bodhi and Karma
This is a repost from a year ago, but it seems that people have forgotten:

Don't downkarma an update because your bug isn't fixed if the bug is not listed as fixed in the update. You are just keeping users whose bugs *are* fixed from getting their fixes. If the current stable contains the bug, and it is still present in the update, that is not a regression. All of the bugs that are fixed actually make it a better package than the one currently in stable. We push out updates fairly regularly. Those updates typically fix several bugs. If I could put together an update and fix every bug we had, I would do it in an instant! In the meantime do not be so selfish as to try to keep others from getting their bug fixes installed just because your bug is still there.

If you are in doubt, it might be worth reading the Fedora QA Update Feedback Guidelines.

Tuesday, January 22nd, 2013
9:43 am - Help figure out the debug slowness
As mentioned in the FUDCon kernel talk, we are trying to figure out exactly what causes the massive slowdown for some people with debug kernels. At this point, debug is completely off in the rawhide kernel. Every update this week will turn on more debug options until we find out which one is causing the slowdown. For this to work, we need people testing rawhide proper (not rawhide-nodebug). So please, if you can update daily and give us feedback when you hit a wall, we would really appreciate it. Feedback should be sent to kernel@lists.fedoraproject.org

Saturday, January 12th, 2013
9:15 am - Fedora 18 kernel 3.7.2
Kernel 3.7.2 is making its way into updates-testing to make a zero day update. We would love to see as much testing as possible, and we need your help. Give it a spin and let us know how things work out. Let us know what you find. Thanks!

For those curious about the baserelease of 201, it came up in the community kernel meeting on Friday. To facilitate upgrade path, F17 will start baserelease at 101, F18 at 201, F19 when it branches will start at 301. This will help ensure that the upgrade path is maintained.
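
The three starting values fit a simple pattern, sketched below in Python. The function name is made up, and applying the formula beyond F17 to F19 is an extrapolation, not something decided in the meeting.

```python
def starting_baserelease(fedora_release):
    """Starting baserelease for a Fedora release: F17 -> 101, F18 -> 201, F19 -> 301."""
    return (fedora_release - 16) * 100 + 1
```

Because each newer release starts 100 higher, the same kernel version built for a newer Fedora release starts out with a higher release number than the build for the older one, which is what keeps the upgrade path intact.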

Wednesday, December 19th, 2012
10:09 am - 3.7.1 kernel in the rawhide-nodebug repository
As Josh pointed out yesterday, the 3.7 kernels are in a holding pattern for Fedora until the Fedora 18 release. There are good reasons for this, and I won't rehash them here. For those who want to run 3.7 kernels, I have put the 3.7.1 update in the rawhide-nodebug repository. Rawhide is holding on 3.7 until the 3.8 merge window is closed, and with the vacation season starting next week, probably through the remainder of the year. I will continue updating the 3.7 stable kernels in rawhide-nodebug until rawhide starts moving forward with 3.8 kernels.

Thursday, November 8th, 2012
12:24 pm - Announcing the rawhide kernel nodebug repository
It has been discussed in the past that we should have a repository of
the rawhide kernels with debug turned off to encourage more users to run
the latest upstream snapshots. That repository now exists. You can enable
it by dropping fedora-rawhide-kernel-nodebug.repo into /etc/yum.repos.d
and doing a yum update. This will contain the (almost) daily rawhide updates
built with debug turned off.

Bugs against this kernel should be filed in bugzilla against the rawhide
kernel. Any questions or comments about the repository itself should be
sent to the fedora kernel list.

Thursday, August 9th, 2012
4:43 pm - August 14th Kernel Regression Test Suite Virtual FAD
The kernel team is hosting a Virtual Fedora Activity Day to get the ball
rolling faster on the kernel regression test suite and we need your
help! We are looking to get new tests written, possible framework
improvements, and generally make the test suite more robust.

When: Tuesday August 14th
Where: #fedora-kernel on freenode IRC
Source: git://git.fedorahosted.org/kernel-tests.git

All details are available at:

Friday, June 15th, 2012
2:34 pm - Bodhi and down karma...
Don't downkarma an update because your bug isn't fixed if the bug is not listed as fixed in the update. You are just keeping users whose bugs *are* fixed from getting their fixes. If the current stable contains the bug, and it is still present in the update, that is not a regression. All of the bugs that are fixed actually make it a better package than the one currently in stable. We push out updates fairly regularly. Those updates typically fix several bugs. If I could put together an update and fix every bug we had, I would do it in an instant! In the meantime do not be so selfish as to try to keep others from getting their bug fixes installed just because your bug is still there.

Tuesday, May 8th, 2012
9:43 am - Uverse gone wrong
Yesterday morning I woke up to find that my internet service was no longer working. Go U-verse. It seems it went down around 4AM, and wasn't coming back. I tried a hard reset, then called to have them reboot the box since I couldn't log into the router myself. It wasn't working. They scheduled someone to come out that day, and eventually they arrived. After they replaced my router, it still wasn't coming back. They replaced parts all over the neighborhood to try to fix things. Eventually the technician did a full reset, which made service come up for about 30 seconds, then it pulled an update and did not come back. After a bit of back and forth, and a lot of time, they finally got me working. This alone was annoying, as it cost me several hours of my time, but it gets much worse.

For a bit of background, I have had my home network set up on 10.13.66/255 since 1998. The reasons that netblock was chosen no longer exist, and haven't in a decade, but I also didn't have much reason to change it. A few years back, I set up a DHCP server, though lots of pieces on my network were still on static addressing (again, no reason to change it at the time). This includes a lot of appliances (squeezeboxes, my aquarium) which don't have keypads for addressing. I have scripts written/used which go by address instead of name, I can tell you the IP of any box in my house as easily as I can tell you the name. It's a practice I have moved away from, but 10+ years of cruft that was working, I didn't bother changing.

The latest firmware for U-verse disables 10. networks from their "default dhcp configuration". Again, no big deal: I don't use their dhcp service since I can't set it for pxe; I use my own. I just go into manual configuration and set the router to and I am done. Except that I can't. Their firmware now states that 10. addresses are not allowed for the router. Now, on top of my hours of downtime because the service wasn't up at all, I have to readdress my entire network. I was given no warning that this was coming and no time to prepare for it. If I wanted things working, I had to either put another router in front of that one, or readdress everything. Thanks again AT&T. I finally have everything working, but I have to ask why such a change would be made with no notification to customers that it was coming. Why was I given a firmware in the first place that would brick my box? This wasn't the networking, as the reset put it back to 192.168.1 addressing and when it updated, it still bricked.

The last of the mess hit me this morning. I have a script that backs up lots of important things from my file server. I do not keep the backup mounted, just as an extra precaution. The script mounts, backs up, unmounts. I have been doing this for years as well. Laziness prevailed and I never bothered to check that the mount was successful before starting the backup. This had worked for years, so I never thought anything of it. Oops: I forgot to readdress the NAS frontend for backup, so the mount failed, the backup went to the local disk, and my / filesystem was full on my mail/file server this morning instead. On the bright side, the script now checks for the mount before doing the backup, and emails me if the mount is unsuccessful.
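
A minimal Python sketch of that kind of mount check, not the actual script. The function name, the callbacks and the /mnt/backup path in the usage note are made up for illustration; the only real call is os.path.ismount.

```python
import os


def guarded_backup(mount_point, do_backup, notify, is_mounted=os.path.ismount):
    """Run do_backup only if mount_point is really mounted; otherwise notify.

    Returns True if the backup ran, False if it was skipped.
    """
    if not is_mounted(mount_point):
        # Nothing is mounted here, so writing would fill the local filesystem.
        notify(f"Backup skipped: {mount_point} is not mounted")
        return False
    do_backup(mount_point)
    return True
```

In real use, do_backup would be whatever copies the data to a path such as /mnt/backup, and notify would send the email. Passing is_mounted as a parameter just makes the check easy to test without a real mount.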

Today I am hoping for a much less frustrating day.

Thursday, October 20th, 2011
12:26 am - Fedora 16 Cloud Test Day
Just a reminder that today is Fedora Cloud SIG Test day. Of particular note, we are testing EC2 images, Aeolus, Openstack, and HekaFS. We need your help! Please join us, lend a hand, test things out, and give your feedback. More information is available on the test page, and we will be hanging out in freenode IRC in #fedora-test-day.

Wednesday, October 12th, 2011
5:08 pm - Fedora 16 Beta images for EC2 are available
Just a quick note to let people know that Fedora beta images for EC2 are available, and we would love any testing you might have time for. A list of the beta AMIs as well as all other supported Fedora AMIs is maintained by the Cloud SIG here. Bookmark it as it will point to final images for Fedora 16 as well as all future supported releases. Feedback on these images is welcome:

Through IRC on Freenode: #fedora-cloud
Through the email list: cloud@lists.fedoraproject.org

Also, mark your calendars, the Cloud SIG has a test day coming up on Thursday 10/20. More info will be coming across soon!

Wednesday, September 14th, 2011
4:18 pm - Fedora 16 Virtualization test day tomorrow (September 15th)
Just a reminder that this Thursday (tomorrow) Sept 15th is Virt test day. Test cases and information are available on the Fedora Wiki. We will be hanging out in #fedora-test-day on freenode IRC. If you have any cycles to come help out, we would greatly appreciate it.
Should this message reach you after the 15th, your tests are still valid. Any bugs we can squash before release help out!

Wednesday, August 3rd, 2011
9:22 pm - Fedora 15 EC2 Test Day Tomorrow
This is a little later than originally planned, but the Fedora 15
(yes, Fedora 15 – not a typo) EC2 test day will be on this Thursday
2011-08-04 [1].

Fedora 15 AMIs are available for testing and are listed on the test
day wiki page [1]. The tests are designed to ensure basic functionality
for the AMIs (MTA, httpd, yum etc.).

Since these tests require an Amazon AWS account, we are offering some
compensation (up to US$5) for the first 10 people to go through the EC2
test cases. This will be done on a first come, first served basis -
make sure that you contact rbergeron to verify that you are one of the
10 people or you may not get the credit.


PS – If you have the means to pay for the EC2 time or a free account,
please use that. We’re just trying to make sure that everyone who wants
to participate can.



Wednesday, April 13th, 2011
4:40 pm - Fedora 15 Virt Test Day Tomorrow (April 14)
Just a reminder that tomorrow is Fedora Virtualization Test Day. Test plans and more information for the event can be found on the Fedora Project Wiki. IRC for the event is on freenode in #fedora-test-day.

Please do come along and help out with the testing, 'cause we've got plenty to test! :-)

Wednesday, September 22nd, 2010
3:05 pm - Fedora 14 Virt Test Day!
Just a reminder that tomorrow is Fedora Virtualization Test Day. Test plans and more information for the event can be found on the Fedora Project Wiki. IRC for the event is on freenode in #fedora-test-day.

Please do come along and help out with the testing, 'cause we've got plenty to test! :-)
