What Ralph is up to
Week of Jan 31st
- Released official 2.2.13 kernel (2000/01/21) and put it in the
/pub/netwinder/kernel area. Will be removing the other
intermediates from my own kernel area shortly.
- PPPoE has become a high priority; trying to get as much testing done
as I can find people for... so far, Magma's service is passing with flying
colours, while Sympatico's seems to be causing trouble (at least for
some).
- Officeserver 1.2 (soon to be called 1.5) disk builds pretty much
stabilized; awaiting final QA.
Week of Feb 7th
- More officeserver work, tidying up small bugs, reviewing the
documentation, and the like... Still need to work the new config file
scheme into the autoupdater.
- The mysterious and sinister-sounding kernel error message, "Trying to
free non-small page from 0xnnnn", seems to be showing up regularly
during shutdown on several of my machines. I'm still not sure what it means, or
if it has any run-time consequences.
Week of Feb 14th
- Final builds for os-1.5, not including possible firmware change to
correct the shaker problem.
- Began merging the 2.2.14 kernel into the NetWinder CVS tree
Week of Feb 21st
- The 2.2.14 kernel is now booting and working reasonably well on the
NetWinder. Outstanding issues include the framebuffer lacking support for
non-640x480 modes, the 'buggy cpu' message, and the tulip driver
situation.
- Forwarded ptrace patch (from September) to Russell
- Looked at cuttlefish project (www.pushcache.com) to see how readily it
will move from squid1 to squid2 as the underlying cache service. Apparently
this is already "forthcoming" from the cuttlefish developers.
- Started looking at the diffs between the netwinder kernel (from CVS) and
the corresponding version from Russell King. There are lots of diffs
;P
Week of Feb 28th
- Preparing to commit the 2.2.14 kernel to CVS - a few minor fixes, and
reverting the cyber2000fb to the previous version until the video mode
selection code is working again
- Extended the auto-update code to support reading the rpm's from a CDROM
mounted via Samba from a windows workstation
- Tested the update-from-cdrom on a real windows network... it seems that
windows machines don't like being addressed by IP, though if you use their
hostname it works fine. Other than that, the update worked perfectly.
- Updated officeserver with new firmware for pci timing initialization to
support recently manufactured boards
Week of March 6th
- Checked in kernel 2.2.14
- Moved over to the new filer and updated all the cron jobs I could think
of
- Looked into problem with cross-reference tool on pyramid: it's running
out of disk space... Tracked it down to the cvslogprocessor script,
choking on branch name that it thought was a symbolic tag. Fixed for the
meantime.
- Made final touches for the auto-update cdrom issue and ran around making
sure all the required pieces are in place.
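
The cvslogprocessor failure above comes down to telling branch names apart from ordinary tags. Here is a minimal sketch in Python of one way to make that distinction, assuming the script has access to the revision number stored against each symbol in the RCS file. This is not the actual cvslogprocessor code, and the function and symbol names are hypothetical. It relies on two CVS conventions: a branch is stored as a "magic" number with a 0 in the second-to-last position (x.y.0.z), and a vendor branch has an odd number of components.

```python
def is_branch_number(rev: str) -> bool:
    """Return True if an RCS symbol's revision number denotes a branch.

    Assumption: `rev` is the number as stored in the RCS symbols list,
    e.g. "1.4.0.2" (magic branch), "1.1.1" (vendor branch) or "1.7"
    (ordinary tag on a revision).
    """
    parts = rev.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a revision number: {rev!r}")
    if len(parts) % 2 == 1:
        # Odd component count: a plain branch number such as 1.1.1.
        return True
    # Even count: a branch only if it carries the magic zero, x.y.0.z.
    return len(parts) >= 4 and parts[-2] == "0"


# Hypothetical symbol table, for illustration only.
symbols = {"REL_1_5": "1.7", "os-1_5-branch": "1.4.0.2", "vendor": "1.1.1"}
branches = sorted(name for name, rev in symbols.items() if is_branch_number(rev))
print(branches)
```

A revision that merely lives on a branch, such as 1.7.2.3, has an even component count and no magic zero, so it is still treated as an ordinary tag.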
Week of March 13th
- Usual Monday meetings, then worked on the cut-over of netwinder.org to
real netwinders. User accounts transferred and CVS repository moved over,
then enabled the new gateway. Fixed up all the internal machines to use new
GW and nameserver. Tidied up some errors on the new HTML pages that were
crashing older browsers.
- Chased some officeserver auto-update bugs; the default index.html is not
preserved, easy to fix that. Revived engineering fileserver as cvs box and
transferred the ccc stuff to it. One disk is flaky, need to replace it.
All seems well so re-enabled external access.
- Revamped the backup process for cvs, flush all to common area on filer
and sync nightly to tape from there.
- Auto-update bug fixes: preserve htdig, discussion groups, and infoplace
data. Why this only started showing up now, I'd really like to know.
- New disk for eng. fileserver installed and system transferred over.
Week of March 20th
- Autoupdate bugfixes tested and released.
- Got Bugzilla 2.8 working at home, tried to duplicate here and it has
some problems. Seems the tables aren't initialized quite the same way.
Ongoing.
- Rebuilt MySQL with -fpic option, otherwise it fails when perl tries to
access the database. Andrew looking at some other issues with the MySQL
packages.
- Started packaging FreeS/WAN for the NetWinder.
Week of March 27th
- Completed packaging of FreeS/WAN user space tools into RPM format.
- Started implementing cvs2rpm tool...
- The rest of the week passed in a blur
Week of April 3rd
- Looked at the vpn-masquerade patch by John Hardin. It's slated for
inclusion into 2.2.x mainline, but Alan's in "bug fix" mode and won't accept
the patch at the moment. Maybe we can wait...
- Some testing and fine tuning of the freeswan rpm package.
- Prepared a series of kernel patches "ready to go in". PC speaker,
ipsec, ip_masq_vpn, for starters.
- Got rid of most of the "RedHat"-isms in the bugzilla database, and most
functions seem to be working now.
Week of April 10th
- A few more bugzilla changes - new colors and banner images, changed some
URL's to point at netwinder.org.
- Added info about the -O sparse_super option for mke2fs in the
HOWTOs
- Looked at DPMS console blanking - works fine, use setterm to
activate it.
- Released 2.2.14 kernel with freeswan and pcsound patches. Mainly for
internal testing.
- Tried out the freeswan gui plugin.
- Added a few more nodes to the rc5 cluster.
Week of April 17th
- USB support in 2.3.99 - looked into the state of things, tried it out
with the only USB device I could find - a microsoft glow-in-the-dark
"intellimouse". Seems to work, all 5 buttons even. Only mystery now
is how that optical encoder works ;)
- Looked into 256MB support, won't currently work because there is no
space for vmalloc in the memory map. Reportedly will work with 2.3.99.
- Trying to duplicate RMK's troubles with the cyberpro interrupt and
bm-dma. After a day of kernel building and other fiddling, got pretty much
the same results: capture to the video DRAM works, but dma transfer to main
memory returns many junk pixel swaths. Need to review the bm-dma setup.
- Looked into problem reported by Mark Lord where ipchains -F foo
leads to an alignment fault. I'd say it's harmless in most cases, but there
is an underlying problem that is quite ugly. If the socket options are
passed an unaligned string, there will be an alignment fault. Reported to
Russell and Alan, awaiting resolution. One solution is to change
get_user call to a copy_from_user instead - this seems to
work.
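
The difference between the two approaches in the ipchains item can be pictured with a toy model. The Python sketch below is only a user-space analogy, not kernel code, and every name in it is made up for illustration. It assumes a word-sized load that traps on an unaligned address (the get_user-style case) and compares it with copying the bytes into a local buffer before interpreting them (the copy_from_user-style case).

```python
import struct

WORD = 4  # bytes in the 32-bit word being fetched


class AlignmentFault(Exception):
    """Stand-in for the CPU's alignment trap (toy model only)."""


def load_word(memory: bytes, addr: int) -> int:
    """Model a single word-sized load, which needs natural alignment."""
    if addr % WORD != 0:
        raise AlignmentFault(f"unaligned word load at {addr:#x}")
    return struct.unpack_from("<I", memory, addr)[0]


def copy_then_load(memory: bytes, addr: int) -> int:
    """Model the byte-wise copy: copy to a local buffer, then read it."""
    local = bytes(memory[addr:addr + WORD])  # byte copy, no alignment needed
    return struct.unpack("<I", local)[0]


# A buffer whose 32-bit value sits at offset 1, i.e. unaligned.
memory = b"\x00" + struct.pack("<I", 0x12345678) + b"\x00\x00\x00"

try:
    load_word(memory, 1)
    faulted = False
except AlignmentFault:
    faulted = True

value = copy_then_load(memory, 1)
print(faulted, hex(value))
```

The byte copy costs a little more, but it works wherever the caller's string happens to start.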
Week of April 24th
- Trying out gmp-3.0 library on ARM (request from the ipsec guys)
- Worked on security pages. I think we are about ready to make them
public now.
- Fixed kernel problem with setsockopt() routine, triggered by ipchains -F
command input.
Week of May 1st
- Spent most of Monday moving :)
- Took a stab at building Reiserfs support into the arm kernel. The
kernel patches trivially, the user space utilities are a bit ugly since they
insist on using kernel macros for get/setbit operations (but on arm, these
are functions and not macros). Can kludge around it by adding -lc
../../../../arch/arm/lib/lib.a to all the link lines. Once this is
done, I can create a filesystem, mount it, but listing it returns endless
"no such file or directory" errors.
- Looked into parallel cdwriter problems. Current backpack drives
("series 6") are not supported by parport; vendor has binary only drivers.
I've contacted them in hopes of getting arm binaries or source code.
- Reviewed PCI quirks with Rick and from the old list archives. This is an
old problem that is still around. It looks like it's a bug in the arbiter...
looking into it some more.
- The JFS from IBM seems to patch and build quite cleanly. It even works,
mostly. Partitions can be formatted and files copied in and out, dirs
listed, etc. However it doesn't handle named pipes, and recursive delete
(rm -rf) doesn't work (it claims "directory not empty").
Promising, but not ready for prime time yet. Also missing a filesystem
checker.
- Looked into openldap problems that Woody reported. I have the same
findings as him, namely the testsuite fails, but when I strace or ltrace it,
it works (there is a failure later on though). Ongoing investigation.
Week of May 8th
- Moving bugzilla once more, so we can put up the public side of it
safely. Also will make the kernel cross-ref public.
- Further investigation into pci lockup issue. Still can't point the
finger of blame specifically, but will try several variations now.
- Fixed kernel cross-ref, which wouldn't read the db files anymore.
Deleting and rebuilding them seems to fix the problem. Maybe hostname was
embedded or something like that.
- Freeswan package updated slightly for officeserver-1.6 (handling of
connections on startup, mostly).
Week of May 15th
- Mostly worked on PCI bugs. Winbond has admitted they are aware of the
problem we are seeing with the 553, and recommend the 554 as a drop in
replacement. I've tried it, and discovered that chip won't even boot the
system unless I disable the PCI retry enable bit at 0x40. But running any
DMA process to the 553 (disk or sound) still locks the system. Meanwhile,
Russell found some settings for the ISA host that reduce the trouble, so
audio plays reasonably well.
- Found problem with latest binutils (2.9.5.0.47), seems to choke on some
asm in the kernel's entry-armv.S. Spent a while finding a version that
works.
- Nasty bug in the python interpreter, which shows up when running
mailman's mail-to-news gateway. Blows up in glibc's chunk_alloc(), which
sounds very familiar for some reason. Will get Scott to check next
week.
- Worked on the website. The autobuild pages are now automatically
regenerated, and security/bugzilla are just awaiting IP addresses.
Week of May 22nd
- Enjoyed the Victoria day holiday immensely. The weather was nice, and I
finally managed to have charged batteries in the helicopter *and* calm
winds, so I was able to do some flying! I need more relaxing weekends like
this one. :)
- Tuesday, worked on web site, sorted out a minor Bugzilla problem, chased
various small bugs. Then the power failed, the whole building and several
others nearby. That pretty much killed the afternoon.
- Wed, launched the security and autobuild sections of the website. Also
fixed up some scripts that didn't comply with the current stylesheet and ssi
conventions. Tried out 256M on R6 with the 2.3.99-pre8 kernel; crashes if
you force the mem, otherwise only sees 128M.
- Thu, poked at the 256M support some more, discovered it needs changes in
firmware. Andrew made a quick stab at it which resulted in me running to
find the JTAG equipment. And wondering why my minicom doesn't want to
transfer files in xmodem anymore. Later a slightly improved firmware
version successfully booted 256M. No swap space though, suspected kernel
bug.
- Played with inn and xtradius on the Winder. The latter seems to have
problems authenticating users against the system password file. Perhaps
this is a shadow passwords problem; though the code was built with
-DNOSHADOW, maybe it's still broken.
Week of May 29th
- Merged in kernel 2.2.x fixes for the msr cpsr_c, and noted that
the same should be done for msr spsr_c.
- Merged Sean's LED drivers, Scott's change to NWFPE initialization,
cyberpro init, into the 2.2 branch.
- Resurrected the kernel autobuild (the latest version hadn't been
committed, but thankfully the disk was still around)
- Patched suid capabilities bug and issued new kernel & advisory.
Week of June 12th
- Issued corrected kernel with fan driver working, and masq_vnc code
reinstated. Discovered problem with the old freeswan patch; regenerated it
and it worked fine. Weird.
- Tested plip again. No change in status; in 2.2.13 it works so long as
tcpdump isn't running, in 2.2.14 no ping replies are generated. And when
you mix 2.2.13/14, replies are generated at one end but are never
received.
- Did more plip testing on Wed. Tried all the combinations I could, and
got a tad frustrated with the whole thing. Called it a day.
- Thurs, looked into freeswan-patch that stopped working: no actual
changes in source were found (good). Reviewed changes in net drivers
between 2.2.13 and 2.2.14 in hopes of finding a change that might explain the plip
problem (nothing obvious). Modified kudzu to issue RMK's magic ISA bridge
settings before playing the startup sound, so as to reduce the effects of
DMA garbling the sound.
- Rolled cvs2rpm into a package, and created nwleds package as a model.
The latter to be integrated into initscripts package for status monitoring
during fsck at startup.
- Delved deeper into the plip driver mystery. There is definitely a
problem in the receiver code: the plip driver correctly receives a packet
and passes it up to the next layer. I followed it two layers up, until I
lost the path where the skb is queued on a 'backlog' queue (even though
there is no backlog). So I tried looking from the top down. The ping
packet never shows up at the ICMP layer, nor is an invalid protocol message
received at the time. However, pings received over ethernet do trigger my
test points. Close, but still not quite there.
- Looked at an issue with my latest freeswan RPM; it seems to have
been built against wrong headers or with a bad compiler or something.
Rebuilt by Luc and seems to work fine; replaced my old RPMs.
Week of June 19th
- Plip driver fixed in 2.2.15; backported and got it working. I was down
to the very function that got patched in 2.2.15... next time, I'll go and
read Alan's notes first.
- Checked in vmalloc fix for 256M support
- Figured out why the freeswan patch got corrupted - it was CVS doing
keyword expansion, in particular on the $Log$ which freeswan seems to use
a lot. Changed CVS to -kb and all is well now. Reverted back to the
original patch, since it was fine all along.
- Issued kernel-2.2.14-20000621 incorporating the previous changes
- Merged in 2.2.15-rmk1 patch into our CVS. Not committed yet, still
testing out some stuff. Issues include: serial number format, ptrace API
changes, mystery symbol EF_ARM_APCS26.
- Reviewed OS auto-update code modifications
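
To illustrate the keyword-expansion problem with the freeswan patch noted above: when a stored patch contains $Log$ in its context lines, a normal checkout rewrites that line and inserts log text after it, so the hunk no longer matches its own header. The Python sketch below is a simplified model of that behaviour, not CVS's exact expansion format. The file names, the log entry and the patch contents are all hypothetical.

```python
import re

LOG_KEYWORD = re.compile(r"\$Log\$")


def expand_log(text: str, rcs_name: str, entry: str) -> str:
    """Simplified model of $Log$ expansion; not CVS's exact output."""
    out = []
    for line in text.splitlines():
        match = LOG_KEYWORD.search(line)
        if match is None:
            out.append(line)
            continue
        prefix = line[:match.start()]   # comment leader, reused below
        suffix = line[match.end():]
        out.append(prefix + "$Log: " + rcs_name + " $" + suffix)
        out.append(prefix + entry)      # the inserted log entry
    return "\n".join(out) + "\n"


def checkout(text: str, binary: bool) -> str:
    """Model a checkout: binary mode (-kb) returns the file untouched."""
    if binary:
        return text
    return expand_log(text, "demo.patch,v", "Revision 1.1  (hypothetical entry)")


# Hypothetical patch whose context lines happen to contain the keyword.
PATCH = (
    "--- a/demo.c\n"
    "+++ b/demo.c\n"
    "@@ -1,3 +1,4 @@\n"
    " /*\n"
    "  * $Log$\n"
    "  */\n"
    "+int demo;\n"
)

expanded = checkout(PATCH, binary=False)
intact = checkout(PATCH, binary=True)
print(len(PATCH.splitlines()), len(expanded.splitlines()), intact == PATCH)
```

In this model the hunk header still promises three old lines and four new ones, but the expanded body has grown by one line, which is the kind of mismatch that makes a patch look corrupted.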
Week of July 3rd
- Monday was a holiday - too bad it rained...
- Submitted kernel patches for 2.2.15 to RMK: cyberpro, waveartist,
vmalloc, LED driver, and NWFPE.
- Updated the x86 kernel RPM to include intel's version of the ePRO100
driver, which apparently works better with the boards we've got. Revision
number set to 2.2.16-nw1.
- Rebuilt vnc on dm-3.1-15 image for customer, no problems encountered.
Old binaries renamed, and new ones put in place on ftp site.
- Disk usage getting pretty high, and there are a few users with very
large files in their homes - spent some time looking at them. Decided we
need to do something to avoid all the duplication of effort in RPM
building.
Week of July 10th
- Merged 2.2.16-rmk1, found a few problems, talked to RMK and resolved
them.
- Looked at problem with Apache+mod_ssl+php+mysql. The php mysql.so
library has PC24 relocs, despite a blatant -fPIC in the makefile. It links
in a static library (libmysqlclient.a) which wasn't built with -fPIC. The
difference being that one has only PC24 relocs, whereas the one with -fPIC has
PC24 and 32 relocs interspersed. If we rebuild the mysql.so module with a
properly built libmysqlclient.a, everything works.
- Cleaned up ARM 2.2.16 kernel and checked into CVS. Noticed that the
non-arm directories are a bit out of sync with the mainstream 2.2.16 -
posted about this on commitinfo - awaiting replies.
- Buried head in sand over the vpn issue. For all my whining about it,
nobody else sees the problem, so obviously I'm missing something. Oh well,
the good guys always finish last, right?
- Updated cvsweb scripts on yuri and trinity due to remote-root exploit
that was posted on bugtraq.
- Sean discovered that the 2.2.15 and 2.2.16 arm kernels won't recognize
128M of ram. Looked into this and found that RMK uses a different
sanity check on the param structure - and I merged his way into our tree.
Either we'll have to revise the check, or ditch it completely.
- Decided to add additional checks for 128/256M. Committed.
Week of July 17th
- Kernel target keeps shifting - need to nail down requirements before
doing any more work.
- Built and tested new nfs-utils package and issued NWSA for it
(formatstring remote root exploit).
- Attended OLS, which was a pretty cool show.
Week of July 24th
- Kernel SRPM for multi platforms still having troubles - base version now
builds on everything, but various add-ons have trouble.
- Fixed up the web indexing script to search redhat folders, and to handle
spaces in filenames.
- Continued working on multiplatform kernel SRPM. Now builds on arm and
x86 with all the basic patches. Unfortunately, there was no way to avoid a
few patches that needed to be separately generated for arm and x86.
- Worked on building freeswan support into sparc kernel. Several problems
identified and kludged around. There are some optimizer bugs that are cured
by using -O0 or -O for now. The sys/cdefs.h header doesn't define
__restrict due to a flub in the gcc spec file setup; kludge this by editing
the header. With these, can build an ipsec.o kernel module that loads.
- Building the freeswan user-space tools under sparc also led to problems
in the gmp library. Going to the new version (3.0.1) didn't cure them at
all. After lots of digging, I convinced it to work by adding #define
LONGLONG_STANDALONE in gmp/longlong.h and it built.
- Bringing up an ipsec connection failed. The problem is an ioctl
IPSEC_SET_DEV which passes a (struct ipsectunnelconf *) that doesn't arrive
correctly at the other end. Seems we'll have to add a fixup routine to
arch/sparc64/kernel/ioctl32.c, but there is already a conflicting entry in
there (SIOCGPPPSTATS, from the ppp device).
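
One common reason a struct fails to arrive intact when 32-bit user space talks to a 64-bit kernel is that the two sides disagree on its layout, which is the sort of thing a fixup routine translates. The Python sketch below shows the idea on a made-up struct. It is NOT the real ipsectunnelconf, whose fields are not shown here, and the names are hypothetical. It assumes a big-endian machine where long is 4 bytes in 32-bit code and 8 bytes (8-byte aligned) in 64-bit code.

```python
import struct

# Hypothetical struct, NOT the real ipsectunnelconf:
#   struct demo_conf { int cmd; long arg; };
LAYOUT_32 = struct.Struct(">il")    # 32-bit caller: long is 4 bytes
LAYOUT_64 = struct.Struct(">i4xq")  # 64-bit side: 4 pad bytes, 8-byte long


def fixup_32_to_64(raw32: bytes) -> bytes:
    """Translate a buffer in the 32-bit layout into the 64-bit layout."""
    cmd, arg = LAYOUT_32.unpack(raw32)
    # Repacking sign-extends `arg` and inserts the alignment padding.
    return LAYOUT_64.pack(cmd, arg)


user_buf = LAYOUT_32.pack(3, -2)          # what the 32-bit program passes in
kernel_buf = fixup_32_to_64(user_buf)     # what the 64-bit side expects
print(LAYOUT_32.size, LAYOUT_64.size, LAYOUT_64.unpack(kernel_buf))
```

Without the translation step, the 64-bit side would read 16 bytes from an 8-byte buffer and interpret the fields at the wrong offsets.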
Week of Aug 8th
- Worked on GPIO assignments for the southbridge. All pins assigned now,
but still working out issues with flash addressing and selection. For the
time being, we'll require the 4M flash part.
- Meshed our pin assignments with those in gda spec, couple of
differences, memo written to try and unify them.
- Sparc box reimaged with RedHat 6.2, because it wouldn't boot a stock
kernel anymore (probably due to repeated e2fsck's with "-y" option). After
ironing out a few wrinkles in how to boot this thing, it is now working
again. Rebuilt the master kernel SRPM and resulting kernel also boots.
Didn't try freeswan yet.
- Investigated Strataflash, which looks to be replacing the current flash
part we're using. Main difference is the block size of 128k instead of 64k,
and several new features like suspendable-erase cycles. And of course the
programming interface is all different.
- Helped hardware team with figuring out the necessary CPLD functions
- Tidied up the master kernel SRPM some more. Rebuilt on ARM, sparc,
intel.
Week of Aug 14th
- More GPIO madness and reviewing of the overall design, CPLD changes, and
the like
- Retested freeswan on sparc; pluto is working but klips is not.
Debug log sent to rgb. He noticed I was running two different versions of
freeswan (1.3 and 1.5).
- Upgraded sparc to freeswan-1.5. The ipsec.o module wouldn't load
anymore, it is too large. Stripping it helped, but the proc entries aren't
being registered for some reason, and so the ipsec setup script fails.
- Rebuilt x86 kernel for John with LFS enabled this time, see if this
solves the problems with large backups
Week of Aug 21st
- Customer site visit, replaced older rack with new one
- Freeswan testing/debugging (on sparc64), and helping Luc with the x500
pki.
Ralph Siemsen / ralphs@netwinder.org