Archive for the ‘Open Source’ Category

SVLUG Meeting: Not Your Father’s Assembly Language with Randall Hyde

Wednesday, July 7th, 2010

At the Silicon Valley Linux Users Group tonight, Randall Hyde talked about a more modern implementation of assembly language, HLA – the High Level Assembler.

He talked about his career as a programmer, college lecturer at UC Riverside, computer book author and developer of nuclear reactor control software.

It was interesting to hear first-hand that CS students during the dot com boom actually did enroll “just for the money”, regardless of interest in science or ability.

Originally his book on HLA was a download-only book, but No Starch Press was looking for content and actually contacted him for permission to publish it. It proved to be a popular book and another version is planned.

He said it takes about 2 years to learn the domain-specific knowledge about nuclear reactors, plus whatever time it takes to learn the programming languages or tools used for the project.

Using a debugger on nuclear reactor control software results in a scram (an emergency reactor shutdown), so planning ahead is a good idea.

Thanks again to Symantec for hosting the meeting.

MySQL Privilege System Still a Mess in 2010

Tuesday, July 6th, 2010

It’s already 2010, but the MySQL privilege system has been a mess for over a decade.

Most DBAs are aware that under heavy connection load, the MySQL internal resolver can have problems resulting in login failures, if you don’t use skip-name-resolve.

But I found what appears to be another serious bug …

Recently I issued a GRANT to create a new user with a wildcard hostname, 'nagios'@'%.domain.com', and the REPLICATION CLIENT privilege on ten lightly loaded 5.1.30-pro slaves running CentOS 5.4 without skip-name-resolve. Afterwards, one of the slaves stopped accepting remote connections from any user name. (Local connections still worked fine for all users.)

That’s right … the only change was a GRANT.

Execute the GRANT command …


mysql> GRANT REPLICATION CLIENT on *.* to nagios@'%.domain.com' IDENTIFIED BY 'password';

On a remote server …


$ mysql -u user -ppassword -h hostname
Error 1045: Access denied for user 'user'@'hostname' (using password: YES)

The only thing slightly unusual about that machine was that it had more than one hostname or domain name.

So what can one do to lessen occurrences like this, or at least not get bitten as hard?

  • disable hostname lookups with skip-name-resolve
  • preconfigure grants before going into production
  • expect the unexpected when changing grants in any way
  • know how to quickly and cleanly shut down the mysql instance and restart it, ideally with startup scripts.

How can one diagnose MySQL privilege bugs?

  • try connections from localhost and remotely
  • write a test script to attempt remote connections to help isolate problems
  • run SHOW FULL PROCESSLIST and look for login states or other odd entries
  • run mysqladmin flush-hosts to reset the internal DNS host name cache.
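
The test script idea from the list above could look something like the following. This is only a minimal sketch: it shells out to the same mysql command-line client shown earlier, and the function names (build_command, probe, probe_all) are hypothetical, made up for this illustration. It assumes the mysql client is on the PATH and that putting the password on the command line is acceptable for a throwaway test account.

```python
import subprocess


def build_command(host, user, password, timeout=5):
    """Return the mysql client argv for a one-shot login test."""
    return [
        "mysql",
        "--connect-timeout=%d" % timeout,
        "-h", host,
        "-u", user,
        "-p" + password,   # same -ppassword form as the command above
        "-e", "SELECT 1",  # trivial statement, so success means the login worked
    ]


def probe(host, user, password, run=subprocess.run):
    """Try one login against host; return (ok, message)."""
    try:
        result = run(build_command(host, user, password),
                     capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired) as exc:
        # mysql client missing, or the attempt hung past the timeout
        return False, str(exc)
    if result.returncode == 0:
        return True, "ok"
    # On failure the client prints the server error (e.g. 1045) on stderr
    return False, result.stderr.strip()


def probe_all(hosts, user, password, run=subprocess.run):
    """Probe every host and return {host: (ok, message)}."""
    return {host: probe(host, user, password, run=run) for host in hosts}
```

Calling probe_all(["slave01.domain.com", "slave02.domain.com"], "user", "password") (hypothetical host names) from a remote machine before and after a GRANT shows which slaves changed behavior. The run parameter is there only so the script can be exercised without a live server.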

MySQL Manual 5.0: 5.4.7. Causes of Access-Denied Errors
MySQL Manual 5.1: 5.4.7. Causes of Access-Denied Errors
MySQL Manual 5.1: 7.5.11. How MySQL Uses DNS
Jeremy Zawodny: Fixing Poor MySQL Default Configuration Values (2001)

IMUG: Game Localization: Are We Having Fun Yet?

Thursday, June 17th, 2010

Tonight at the International Multilingual User Group (IMUG), Anthony Fitzgerald from SimulTrans gave a good talk, “Game Localization: Are We Having Fun Yet?”

Anthony has extensive experience in game localization, going back to Sierra On-Line’s Leisure Suit Larry franchise (challenging to localize because the user input was freeform text) and Valve Corporation’s Half-Life (which required about 6 months to localize).

Many members of the game development community also attended, including an Ubisoft localization project manager and Sierra On-Line’s fourth staff programmer.

Localization staff typically have to play the game for a week to understand how it works and what work needs to be done. :)

There are various localization levels for games:

  1. Just localize the box art
  2. Add subtitles
  3. Localize everything

For networked games, some additional issues are:

  • localize just the client (user) program, or the server too?
  • what if player messages to each other are in different languages?

Cheats are very helpful for speeding up the testing of levels, features, etc.
Testing on the lowest supported resolution is the fastest way to find font and string problems.

When testing PC games, particular graphics cards may be needed. Localizing console games is more involved though, requiring a developer console and software development licenses, possibly costing thousands of dollars per console.

Besides the usual challenges in localizing products, games have 2 additional steps:

  1. for console games, manufacturer (Sony, Nintendo, or Microsoft) acceptance testing is required. 10 to 15 days need to be scheduled for the initial test run, followed by 5 to 10 days per round of additional testing for rejected games. Additional test runs are billed by the manufacturer.
  2. ratings boards, like the ESRB in the USA.

Also, holiday deadlines for games are literally that: missing a holiday means that your audience has already spent their discretionary budget, so the release is fruitless.

Thanks to Google for hosting the meeting.

PENLUG Meeting: Linux Open-Source Virtualization Roadmap

Wednesday, May 26th, 2010

Jamie Cameron, the author of Webmin, gave a talk on Linux virtualization at the Peninsula Linux Users Group (PENLUG) in the Bayshore Technology Park in Redwood City tonight.

He’s working on 2 new products, Virtualmin and Cloudmin, so he has had to learn the ins and outs of the current state of Linux virtualization with respect to hosting.

His favorite is Xen, but for some reason Red Hat is providing more support for KVM (Kernel-based Virtual Machine), which has several disadvantages including lack of CPU limiting. Red Hat acquired Qumranet, the company behind KVM, in 2008.

OpenVZ is popular with budget hosting providers, and Virtuozzo with those that want to pay.

Linux-VServer is the lightest weight alternative, similar to FreeBSD jails, but also the least maintained at this point.

He gave a demo of Cloudmin, including creating a guest and logging into it.

Since Linux has no ABI standard, he prefers developing in scripting languages like Perl for maximum portability.

wikipedia: webmin
Ganeti is a “cluster virtual server management software tool built on top of existing virtualization technologies such as Xen or KVM and other Open Source software.”

ClusterIt dtop Command Ported to Linux

Tuesday, May 25th, 2010

The lightweight ClusterIt toolkit mostly worked on Linux, but dtop (distributed top) still expected BSD-style top syntax.

Here’s a diff I wrote to make dtop work on recent versions of Linux (tested on CentOS 5.5 x86_64):

$ diff dtop.org.c dtop.c

311a312,322
> 	char buf2[31];	/* %30s stores up to 30 chars plus the terminating NUL */
>
> 	if (strstr(c, "Swap:") != NULL) {
> 		sscanf(c, "Swap: %30s total, %*s used, %30s free", buf, buf2);
> 		nd->swap = dehumanize_number(buf);
> 		nd->swapfree = dehumanize_number(buf2);
> 		nd->inactmem = nd->wiredmem = nd->execmem = 0;
> 		return;
> 	}
>
>
344a356
> #if ! defined(__linux__)
364a377
> #endif
470a484,486
> #if defined(__linux__)
> 		case 11:
> #else
471a488
> #endif
517a535,539
> #if defined(__linux__)
> 		} else if (strstr(c, "Tasks:") != NULL) {
>                         sscanf(c, "Tasks: %d ",&nodedata[nn].procs);
> #else
519a542
> #endif

The output of dtop on Linux looks like this:

HOSTNAME  PROCS  LOAD1  LOAD5 LOAD15 ACTIVE  INACT   FILE   FREE SWPFRE SWUSED
  g00-int     64   0.11   0.04   0.01      0      0      0  7345M  2047M  0.00%
  g01-int     64   0.00   0.01   0.00      0      0      0  7033M  2047M  0.00%
  g02-int     61   0.08   0.02   0.01      0      0      0  6980M  2047M  0.00%
  g03-int     64   0.16   0.06   0.01      0      0      0  7011M  2047M  0.00%
  g04-int     64   0.04   0.04   0.01      0      0      0  6996M  2047M  0.00%
  g05-int     61   0.02   0.01   0.00      0      0      0  7424M  2047M  0.00%

Here is the final, hardened version of dtop.c that uses secure C programming techniques (the strn* API and double-free safety).

My long-term preference would be to rewrite dtop in Perl since parsing text input in old-school C is brittle.

Also, dtop should be able to handle top results from heterogeneous systems, and the Linux #ifdefs get in the way of that.
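
To illustrate why a scripting language is less brittle here, this is a rough sketch of the same Tasks: and Swap: parsing done with regular expressions. It is written in Python rather than Perl, purely as an illustration. The names dehumanize and parse_top_line are hypothetical (dehumanize only loosely mirrors dehumanize_number in dtop.c), and the 1024-based units are an assumption.

```python
import re

# Assumed 1024-based multipliers for the k/M/G suffixes that top prints
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def dehumanize(text):
    """Convert a size such as '2047M' or '2096472k' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kKmMgG]?)", text)
    if match is None:
        raise ValueError("unrecognized size: %r" % text)
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.lower()])


# Same fields the sscanf() calls in the diff pick out, with flexible whitespace
SWAP_RE = re.compile(r"^Swap:\s*(\S+)\s+total,\s*(\S+)\s+used,\s*(\S+)\s+free")
TASKS_RE = re.compile(r"^Tasks:\s*(\d+)\s")


def parse_top_line(line, node):
    """Update the dict node from one line of Linux top output."""
    m = SWAP_RE.match(line)
    if m:
        node["swap"] = dehumanize(m.group(1))
        node["swapfree"] = dehumanize(m.group(3))
        return node
    m = TASKS_RE.match(line)
    if m:
        node["procs"] = int(m.group(1))
    # Lines that match no pattern are ignored rather than treated as errors
    return node
```

Since unrecognized lines are simply skipped, patterns for BSD-style top output could sit next to the Linux ones and be tried at run time, which is one possible way around the compile-time #ifdefs.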

And here are some of the debugging commands I used:

ulimit -S -c unlimited > /dev/null 2>&1
valgrind -v --leak-check=full --show-reachable=yes --track-origins=yes ./dtop
gdb ./dtop core

Dan Saks: Why size_t matters
Karpov: About size_t and ptrdiff_t

Mapreduce and Hadoop Links

Sunday, May 23rd, 2010

This is a placeholder post for MapReduce and Hadoop links.

(I operate a small 64-core cluster, and am always looking for ways to keep it busy with FOSS like Hadoop.)

hadoop.apache.org Cluster Setup
Cloudera.com
Mapreduce & Hadoop Algorithms in Academic Papers (3rd update)
Sandia: MapReduce-MPI Library
columbia.edu: Alex K’s Mapreduce Bibliography

wikipedia: SGE 6.2 supports Hadoop
wikipedia: Rocks Cluster Distribution

A grain of wisdom is worth an ounce of knowledge, which is worth a ton of data. — Neil Larson

IMUG: Internationalizing Twitter

Thursday, May 20th, 2010

Matt Sanford gave a comprehensive talk on i18n at Twitter.

He was on the Summize team that Twitter bought, so he originally worked on search, then ended up being the i18n guy.

Twitter had a lot of things stacked against it i18n-wise:

- unilingual source tree
- no budget to do any real engineering initially
- Japanese tree translated by outside volunteer partner, Digital Garage
- primitive Unicode support in MySQL and Ruby
- now a legacy system full of data, hard to add metadata like lang/locale at this point.

Japanese cell phone support was challenging because:

- Shift_JIS, not Unicode
- each of 3 carriers uses a different image format and different emoji codepoints, some overlapping
- lack of l10n resources in house for Japanese-flavor design, including being cute, dense and also having an ad to demonstrate business seriousness
- Japan uses cellular emails, not SMS like other places
- mobile browsers don’t support cookies, so URL-based sessions were needed, unlike in the regular Ruby web app
- hard to tokenize short messages in Japanese.
- need QR (Quick Response) code support (2D barcode)


QR Code

Remarkably, Twitter was invited by a number of carriers to support their phones.

Translations are crowdsourced using Google Groups, interns, and an app integrated with the Twitter site. The effort is now up to 3,700 strings and 2,600 translators. Informal terms like tweet and follower are hard to translate, though.

There was very good turnout, with 60 attendees live and 6 online.

Thanks to Adobe for hosting the venue. (Though I don’t understand why there is a 3-year NDA for attending a public meeting.)

techcrunch.com: Twitter Has Basically Doubled In Staff In The Past 6 Months (June, 2010)
Official Twitter Blog

Google App Engine and Perl

Sunday, May 16th, 2010
Placeholder blog post for Google App Engine and Perl support links.
Google App Engine homepage
code.google.com: Perl App Engine
code.google.com: App Engine Issue #34: Add Perl Support
Brad: Perl on App Engine (July 22, 2008)
Perl App Engine Status Update (July 30, 2008)