SVLUG meeting: Next-generation Samba with John Terpstra

Wednesday, August 4th, 2010

At the Silicon Valley Linux Users’ Group (SVLUG), John Terpstra lectured on the development history and status of Samba, a high-performance storage project he worked on, and ClearOS.

John is a technology manager and co-author of The Official Samba-3 HOWTO and Reference Guide (Bruce Perens’ Open Source Series).

He has previously worked as a VP at TurboLinux and Caldera on Linux clustering products. (I vaguely remember those products from way back around 2000.)

Some of the Samba tips he gave were:

  • trim your Samba configuration file down to essential settings
  • Samba’s Active Directory capabilities enable large networks to scale beyond Microsoft’s implementation
  • network bandwidth consumption can be reduced by proper configuration of WINS and broadcast vs. anycast

John also mentioned that Microsoft is contributing to Samba, both through its effort to make various protocols available to all POSIX operating systems and through interop testing meetings.

He gave an interesting overview of a document discovery project that required an elaborate storage system. He was able to set up a working test environment with RHEL, LVM, GFS2 and DRBD and various filesystems before switching to GlusterFS on top of Solaris ZFS for more efficient handling of directory metadata with deep directory paths containing 800,000 files per directory. (There were approx. 3 volumes containing 14 TB each.)

Thanks to Symantec for hosting the meeting once again.

Axceleon acquires Turbolinux’s EnFuzion Clustering Solution (2002)

O’Reilly Open Source Conference 2010, Portland

Friday, July 23rd, 2010

Once again, the O’Reilly Open Source Conference (OSCON) was held in Portland, Oregon.

It was a good conference, and we had beautiful weather all week long.

Executive Summary

The themes promoted by the conference organizers were Cloud Computing, NoSQL, Emerging Languages (Scala, Erlang, Parrot, Go) and Android phone development.

The @oscon Twitter channel was heavily used to coordinate amongst organizers and attendees. I used the TwiXtreme Twitter client program on my BlackBerry.

Plug Computers were very popular in the Expo area. They are 5 watt ARM-based computers running Debian Linux that fit into a power brick-sized case and cost $99 to $129 depending on features. The Marvell booth had a few models on display, from GlobalScale (GuruPlug) and Ionics. High-end models have dual gigabit NICs, multiple USB ports, a WiFi access point and other expansion ports.

There was also continuing buzz about Facebook’s Flashcache SSD module (GPL v2) for Linux, and about ZFS snapshots.

Tutorials

I went to the Gearman Cookbook tutorial, the first half of the Chef tutorial and some of the Cloud Summit talks.

The Gearman Cookbook tutorial was excellent. After a detailed overview of the Gearman architecture and implementations in Perl and C, a number of use cases were explored in detail, including before and after code samples. The talk was easy to listen to as an overall survey, and it also provided immediately useful info for those wanting to deploy it.

The Chef tutorial was very detailed – too much so perhaps. I went to the first half only, since I am not planning to implement Chef soon (I use PXE and anaconda/kickstart with CentOS), and did not need that level of detail at this time. cfengine, puppet and chef are ops tools for configuring servers. Chef uses Ruby data structures for its configuration files, and has include files and other useful syntax. Basically, users can “code” server configuration, as if they were traditional apps.

I went to some of the Cloud Summit talks and BOFs, but found that anybody who has done a simple project using EC2 knew as much or more than the speakers, some of whom I would call blowhards.

Marten Mickos, president of Eucalyptus, is refreshing in that he is always clear about being in it for the money, while also promoting Open Source.

Sessions

Some of the most memorable sessions to me were:

Introduction to MongoDB, Kristina Chodorow (MongoDB)

Kristina is the maintainer of the Perl and PHP drivers for MongoDB. She gave an overview of MongoDB, a NoSQL document store, and its command-line interface, which uses JavaScript.

Some day she will release a sharding tool for MongoDB.

Scaling SourceForge with MongoDB, Nosh Petigara (10gen), Rick Copeland (SourceForge.net / GeekNet)

Nosh and Rick gave an excellent review of incorporating MongoDB into the SourceForge site.

- SF query load is mostly read-only
- ops team benchmarked a few NoSQL candidates, and MongoDB won on performance
- original MySQL servers had 64 GB RAM. After migration to MongoDB, same server machines but only 8 GB RAM
- backup dumps are verified to be bitwise the same as masters
- have to be careful not to dump all documents in your database to the network or it will max out switches
- SF relies on first-class data centers and replication slaves, less worried about MongoDB mmap (not crash-safe)
- I personally looked at their performance numbers and site graphs (on an iPad), and the end result was impressive.

Perl Lightning Talks

As always, the Perl Lightning Talks are a highpoint of the conference.

The “cartoon” of Vincent Pit’s remarkable CPAN module (VPIT) contributions was both informative and hilarious. Vincent is a French Ph.D. candidate in advanced geometry.

Cloud BOF (3 Hours)

The Cloud BOF was disorganized. It started 30 minutes late and for some reason was subdivided into 4 audience groups. Startups and vendors trying to make a cloud sales push led the BOF, including cloud and DNS service providers.

The Health Regulations subgroup came up with a couple of ways to make the Cloud palatable to regulators, chiefly by encrypting all data because of the multi-tenancy issues with sharing public VMs.

I was in the NoSQL group, which discussed general issues and particular successes. Memcached was the clearest winner, while some people also had success with MongoDB and Redis.

My neighbor was an engineer at Postrank.com. He said that they were happy with HAProxy, but much less happy with the unpredictable IO available when running MySQL on EC2. He also said to carefully look at storage volumes available to your instance, as one is a useful tmpfs. They use AuthSMTP to get around EC2 being generally blacklisted for outbound email.

Database BOFs

MySQL BOF

The MySQL AB engineering staff has left Oracle. Monty Program AB (21 staff) has the core developers, and Percona Inc. (32 staff) has the consultants. Oracle still has some of the InnoDB programmers.

The business plan for Monty Program AB is 60% commercially-sponsored MySQL development, and 40% community-request development. Monty would like commercial users of MySQL to sponsor patches that would benefit them.

Mark mentioned that using Nehalem instructions for CRC was much faster, and that Facebook was using partitions for truncating tables instead of doing multi-record deletes. (See his blog for more details.)

One person mentioned using a commercial backup tool, R1Soft, that inserts a Linux kernel module to allow filesystem snapshots. He said to carefully test backup and restore in your environment, especially for filesystems greater than 1 TB which may exceed certain block counter limits. Peter said that some of his clients had used it with varying success.

It worked for him in his environment, and the file browser allows selective file restore (he uses it to restore by priority where a system runs multiple applications.) It starts at $299 for the Standard Edition, and also has MySQL Add-on and Enterprise Editions.

PostgreSQL BOF

The PostgreSQL BOF talked about 30 or so changes that went into version 9.

One of the most exciting new features is native replication, called streaming replication (block-based). The advantage over Slony-I replication is that Slony-I is trigger-based, so it has a variety of issues, including the inability to replicate DDL commands.

Some of the developers mimed replication events, which was rather amusing to watch. Yes, it was taped.

PostgreSQL is released under the PostgreSQL Licence, which is BSDish.

Peter Zaitsev, co-founder of Percona, organized 3 BOFs, including XtraDB, XtraBackup, Maatkit, Percona Server, Sphinx Search and Running Databases on Flash Storage.

Sphinx Search BOF

Andrew Aksyonoff, the original programmer of Sphinx Search (GPL v2), couldn’t make it to OSCON (the good excuse was that he was busy coding), so Richard Kelm (Sphinx sales/customer support honcho) and Peter filled in (Percona is a business partner with Sphinx, and many of Percona’s clients use it.)

Some of the attendees were existing users, like myself, and some from HP and other companies were looking for a large-scale search solution or alternative to Lucene.

Monty mentioned that the latest MySQL 5.1 should be used, as there have been a number of performance and reliability improvements. Full-text search is supposed to be 10x faster than 5.0, and replication is nearly bug-free by now.

Sphinx Search now has real-time index updates in version 1.1.0 beta. Another very nice feature is SQL+FS indexing.

Here is the full Sphinx 1.1.0 changelog.

Running Databases on Flash Storage BOF

The Running Databases on Flash Storage BOF had a combination of MySQL and Postgres users who have tested or used most of the SSD products: FusionIO, Violin, Intel, OCZ, etc. Everybody was happy with SSD IOPS performance, but less so with cost and with the metadata RAM requirements of the add-in boards (FusionIO may require 4 GB RAM for metadata).

Peter said that 20% to 30% of his clients are already using SSD – across the spectrum of vendors and models. Some are also trying “massive RAM” solutions, like Cisco servers with 384 GB RAM.

Some users had 1+ TB Postgres databases with very thorny backup and mgmt. issues. One solution was to start a snapshot, but not do the copy operation.

Expo Notes

I had an enjoyable talk with Austin Hook, who has operated the OpenBSD Store for many years. He lives near Calgary, the center of OpenBSD/OpenSSH/PF development. He mentioned that some perennial financial contributors had stopped because of the recession, so here’s the donations link.

I also talked to some reps from a Brazilian outsourcing firm, ActMinds. They currently have 400 employees across Brazil and a sales office in Philadelphia. Brazil is only 2 hours ahead of EST. They said the minimum project size is 2 developers and developer turnover is a low 5% per annum. Their pricing is $35 to $45/hour.

And I had fun handling the plug computers on display at the Marvell booth. The Ionics boards are amazingly densely populated.

Discussions

I had the opportunity to talk to a long-time Portland resident who works as a computer consultant. He said that the Portland economy is not doing great, and really hasn’t done well since old-growth logging was stopped after 90% of the forests were cleared. And although hundreds of miles of fiber optic cable have been laid downtown, it’s not available for residential use. However, the Beaverton area does have ubiquitous FTTH.

I also talked to somebody who attended the Emerging Languages talks. He’s working on his M.Sc. in Computer Science, so found those talks fascinating.

Twitter Humor

There were some humorous tweets:

- “my MongoDB and CouchDB mugs are fighting each other.”
- “I got one MongoDB mug, but need two to safely store coffee.”

Notes

Note to self: skip the nightly parties unless you have a date. The bars are too loud to talk to anybody.

Note to the O’Reilly conference organizers: use meetup.com for the BOFs like ApacheCon does. The average audience was about 10 people, and with meetup it would be 4x that.

OSCON 2010 Slides
Tim Bray: Desperate Perl Hacker
Youtube: OSCON 2010 videos
blip.tv: OSCON2010 videos
wikipedia: Plug Computer
Jeremy Zawodny: MongoDB Early Impressions

SVLUG Meeting: Not Your Father’s Assembly Language with Randall Hyde

Wednesday, July 7th, 2010

At Silicon Valley Linux Users Group tonite, Randall Hyde talked about a more modern implementation of assembly language, HLA – the High Level Assembler.

He talked about his career as a programmer, college lecturer at UC Riverside, computer book author and developer of nuclear reactor control software.

It was interesting to hear first-hand that CS students during the dot com boom actually did enroll “just for the money”, regardless of interest in science or ability.

Originally his book on HLA was a download-only book, but No Starch Press was looking for content and actually contacted him for permission to publish it. It proved to be a popular book and another version is planned.

He said it takes about 2 years to learn the domain-specific knowledge about nuclear reactors, plus whatever time it takes to learn the programming languages or tools used for the project.

Using a debugger on nuclear reactor control software results in a scram, so planning ahead is a good idea.

Thanks again to Symantec for hosting the meeting.

sf.pm.org: Hudson for Everybody Else

Tuesday, June 22nd, 2010

Joe McMahon did a nice talk tonite at the San Francisco Perl Mongers (sf.pm.org) on the Hudson continuous integration server.

Hudson is written in Java, but can be used with any programming language (or documentation generator) where Makefile or JUnit output is available.
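
As a rough illustration of what "JUnit output" means for a non-Java project, here is a minimal sketch that turns a list of test results into a JUnit-style XML report of the kind CI servers commonly read. The element and attribute names follow the widely used JUnit XML convention; the exact schema Hudson accepts, and the function name and sample test names, are assumptions of mine and not from the talk.

```python
import xml.etree.ElementTree as ET


def junit_xml(suite_name, results):
    """Render (test_name, passed, message) tuples as a JUnit-style XML string."""
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for test_name, passed, message in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name, name=test_name)
        if not passed:
            # A failed test gets a nested <failure> element describing what went wrong.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suite, encoding="unicode")


# Hypothetical results from a Perl test run.
report = junit_xml(
    "My::Module",
    [("t/01-load.t", True, ""), ("t/02-parse.t", False, "expected 3, got 4")],
)
```

Any test harness that can write a file in this shape, whatever language it is testing, can feed its results to the CI server.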

He’s been happy with the included features so far. One of the features he’d like to try next is spawning a VM from Hudson.

As a Java application, it can be fairly memory-intensive. Hudson plus 400,000 tests requires about 4 GB RAM.

Although most Perl modules don’t require a compile-link step, CI can still be useful for Perl programs to:

  • automatically test across multiple platforms
  • automatically run test suites
  • integrate code from multiple developers
  • record build results in a common location for later analysis.

Joe also talked about cleaning up the output of Devel::Cover by excluding CPAN modules.

I mentioned during the Q&A period that ‘make -i’ can be used to force make to continue on errors.

Thanks to Mother Jones for hosting the event tonite.

Slides
CPAN ID MCMAHON

IMUG: Game Localization: Are We Having Fun Yet?

Thursday, June 17th, 2010

Tonite at the International Multilingual User Group (IMUG), Anthony Fitzgerald from SimulTrans gave a good talk, “Game Localization: Are We Having Fun Yet?”

Anthony has extensive experience in game localization, going back to Sierra On-Line’s Leisure Suit Larry franchise (challenging to localize because the user input was freeform text) and Valve Corporation’s Half-Life (requiring about 6 months to localize.)

Also many members of the game development community were there, including an Ubisoft localization project manager and SO-L’s fourth staff programmer.

Localization staff typically have to play the game for a week to understand how it works and what work needs to be done. :)

There are various localization levels for games:

  1. Just localize the box art
  2. Add subtitles
  3. Localize everything

For networked games, some additional issues are:

  • localize just the client (user) program, or the server too?
  • what if player messages to each other are in different languages?

Cheats are very helpful to expedite testing when testing levels, features, etc.
Testing on the lowest supported resolution is the fastest way to find font and string problems.

When testing PC games, particular graphics cards may be needed. Localizing console games is more involved though, requiring a developer console and software development licenses, possibly costing thousands of dollars per console.

Besides the usual challenges in localizing products, games have 2 additional steps:

  1. for console games, manufacturer (Sony, Nintendo, or Microsoft) acceptance testing is required. 10 to 15 days need to be scheduled for the initial test run, followed by 5-10 days per round of additional testing for rejected games. Additional test runs are billed by the manufacturer.
  2. government ratings boards, like the ESRB in the USA.

Also, holiday deadlines for games are literally that: missing a holiday means that your audience has already spent their discretionary budget, so the release is fruitless.

Thanks to Google for hosting the meeting.

IMUG: Internationalizing Twitter

Thursday, May 20th, 2010

Mark Sanford did a comprehensive talk on i18n at Twitter.

He was on the Summize team that Twitter bought, so originally worked on search, then ended up being the i18n guy.

Twitter had a lot of things stacked against it i18n-wise:

- unilingual source tree
- no budget to do any real engineering initially
- Japanese tree translated by outside volunteer partner, Digital Garage
- primitive Unicode support in MySQL and Ruby
- now a legacy system full of data, hard to add metadata like lang/locale at this point.

Japanese cell phone support was challenging because:

- Shift_JIS, not Unicode
- each of 3 carriers uses a different image format and different emoji codepoints, some overlapping
- lack of l10n resources in house for Japanese-flavor design, including being cute, dense and also having an ad to demonstrate business seriousness
- Japan uses cellular emails, not SMS like other places
- mobile browsers don’t support cookies, so URL sessions needed unlike the regular Ruby web app
- hard to tokenize short messages in Japanese.
- need QR (Quick Response) code support (2D barcode)
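
To make two of the items above concrete (Shift_JIS input and tokenizing short Japanese messages), here is a minimal sketch using only the Python standard library. It is not Twitter's code. The function names are hypothetical, and character bigrams are just one common fallback for text without spaces between words, not necessarily what Twitter's search used.

```python
def sjis_to_unicode(raw: bytes) -> str:
    """Decode Shift_JIS bytes (e.g. from a phone gateway) into a Unicode string."""
    # errors="replace" substitutes U+FFFD for any byte sequence the codec
    # cannot map, instead of raising an exception on unexpected input.
    return raw.decode("shift_jis", errors="replace")


def char_bigrams(text: str) -> list:
    """Split text into overlapping two-character tokens.

    Japanese has no spaces between words, so whitespace splitting yields one
    giant token; overlapping bigrams are a simple dictionary-free alternative.
    """
    return [text[i:i + 2] for i in range(len(text) - 1)]


# Stand-in for a message arriving from a handset in Shift_JIS.
incoming = "こんにちは".encode("shift_jis")
message = sjis_to_unicode(incoming)
tokens = char_bigrams(message)
```

Carrier-specific emoji are a separate problem: they sit outside the standard character set, so each carrier needs its own mapping table on top of a plain codec like this.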

Remarkably, Twitter was invited by a number of carriers to support their phones.

Translations are crowdsourced using Google Groups, interns and an app integrated with the Twitter site. The effort is now up to 3,700 strings and 2,600 translators. Informal terms like tweet and follower are hard to translate, though.

There was very good turnout, with 60 attendees live and 6 online.

Thanks to Adobe for hosting the venue. (Though I don’t understand why there is a 3-year NDA for attending a public meeting.)

techcrunch.com: Twitter Has Basically Doubled In Staff In The Past 6 Months (June, 2010)
Official Twitter Blog

SVLUG: IPv6 Essentials for Linux Administrators with Owen DeLong

Wednesday, May 5th, 2010

At the Silicon Valley Linux Users Group (SVLUG) talk tonite, Owen DeLong from Hurricane Electric did a good talk on “IPv6 Essentials for Linux Administrators.”

Owen is the IPv6 evangelist for Hurricane Electric, an Internet hosting and network services company with 2 data centers in Fremont, 1 in San Jose, and approximately 30 POPs world-wide.

There is urgency to improve IPv6 support and adoption as:

  • IPv4 will run out of /8 blocks available shortly (2011), resulting in scarcity
  • China and other countries are rapidly moving online and require (demand) addresses
  • yet there is a long lead-time to deploy IPv6, perhaps 5 years for a company that hasn’t started preparations.
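
The scale behind those points is easy to check with Python's standard ipaddress module. This is only an arithmetic sketch; the example prefixes (10.0.0.0/8 and the 2001:db8:: documentation range) are my choices and were not part of the talk.

```python
import ipaddress

# One IPv4 /8 block, the unit in which addresses are handed to regional registries.
slash8 = ipaddress.ip_network("10.0.0.0/8")
# The entire IPv4 address space.
ipv4_all = ipaddress.ip_network("0.0.0.0/0")
# A single standard-size IPv6 subnet (documentation prefix).
v6_subnet = ipaddress.ip_network("2001:db8::/64")

addresses_per_slash8 = slash8.num_addresses
total_slash8_blocks = ipv4_all.num_addresses // addresses_per_slash8
addresses_per_v6_subnet = v6_subnet.num_addresses

# How many complete IPv4 Internets would fit in one IPv6 /64.
ipv4_internets_per_v6_subnet = addresses_per_v6_subnet // ipv4_all.num_addresses
```

There are only 256 /8 blocks in all of IPv4, while a single IPv6 /64 subnet holds 2^32 times the entire IPv4 address space.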

He mentioned some interesting “tricks”, including:

  • using an ssh tunnel to bridge IPv4 and IPv6 networks

He also does a separate talk on “IPv6 Essentials for Programmers.”

Owen mentioned after the talk that some of the scripting languages have poor support for IPv6, including Perl.

Thanks once again to Symantec for providing a meeting space.

HE Tunnel Broker Service
brad’s life – IPv6

Zend PHP Conference 2009

Thursday, October 22nd, 2009

The Zend PHP Conference was downtown at the San Jose Convention Center, so I went to that this week.

It was a well-organized, fun-sized conference – just big enough to use half the convention center, which made it easy to get around without a lot of walking between session rooms.

There was also an official, parallel unconference in 2 rooms priced at $199 for non-conference attendees.

The talks were high-quality, the food was great, and wifi worked everywhere. What more could one ask for? Well, a few more power strips next time, perhaps.

I was impressed with the number of attendees from Europe, Australia – and Utah!

I went to Matthew Weier O’Phinney’s tutorials on Monday. He’s the Project Manager for the Zend Framework, including design and supervision of the framework programmers. He’s an excellent speaker and really knows his stuff. Both his Intro to Zend Framework, and Ajax with Zend Framework tutorials were excellent.

My favorite talk of the conference was Eric Farrar’s talk on Mobile Data Synchronization. His slides went through many of the pitfalls of data synchronization, then actually provided a solution: use Sybase’s MobiLink, which is free to use with MySQL and SQL Anywhere. He said a team of 24 has been working for 10 years on that, and it is deployed in millions of devices. He works on the ultraliteweb project.

A Digg sysadmin did an interesting operations talk on the evolution of the Digg data center over the past few years. They’re up to 800 servers in 2 Equinix locations now, and use pre-cabled racks of servers from Penguin Computing. Software-wise, they like Cassandra key-value pair, clusto and puppet. They tried some commercial software in 2007, and didn’t enjoy the experience.

I had some great lunch break talks with other folks. One guy from Ohio was getting interesting SEO results by serving raw XML to clients, and having client-side JavaScript provide styling for human users.

I talked to a couple of folks about their experiences using MySQL NDB Cluster in production. They both said it’s flaky, with one having already abandoned it for a regular MySQL database with InnoDB. He was also using RightScale and Amazon for document processing, and was happy with that combo.

There were about 20 exhibitors in 2 aisles, so easy to talk to all of them.

I got personal demos of RightScale’s cloud admin app, Zend Studio IDE, and BCDSoftware’s WebSmart PHP code generator.

WebSmart PHP is a $4600 code generator for ex-RPG and COBOL programmers. It provides a basic IDE, but the interesting part is that whatever you might want to do is either documented in hundreds of online technote examples, or available by contacting their unlimited support department.

Some of the unconference talks I went to included improving cookie security by embedding the SSL session id, and part of the continuous integration session (they talked about Hudson and CruiseControl, but not BuildBot).
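
The talk's own code is not in my notes, so here is only a minimal sketch of the general idea: bind a cookie to the SSL session it was issued on, so a cookie stolen and replayed over a different SSL session fails verification. The HMAC construction, the secret key, and the function names are all my assumptions; how to obtain the SSL session id depends on the web server and is left out.

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice this would be random and private.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def sign_cookie(user_id: str, ssl_session_id: bytes) -> str:
    """Return a cookie value whose signature covers the SSL session id."""
    payload = user_id.encode("utf-8") + b"|" + ssl_session_id
    mac = hmac.new(SECRET_KEY, payload, hashlib.sha256)
    return f"{user_id}|{mac.hexdigest()}"


def verify_cookie(cookie: str, ssl_session_id: bytes) -> bool:
    """Accept the cookie only if it was signed for this SSL session."""
    user_id, _, _ = cookie.rpartition("|")
    expected = sign_cookie(user_id, ssl_session_id)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(cookie.encode("utf-8"), expected.encode("utf-8"))


# Hypothetical session ids standing in for values supplied by the web server.
cookie = sign_cookie("alice", b"\x01\x02\x03")
same_session_ok = verify_cookie(cookie, b"\x01\x02\x03")
other_session_ok = verify_cookie(cookie, b"\x09\x09\x09")
```

One design trade-off: SSL sessions can be renegotiated or expire, so a scheme like this forces a fresh login whenever the session id changes.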

The unconference talk on PHP and queues was quite good, with an overview of Amazon Simple Queues (good), Gearman (no persistence), beanstalkd (rave), and custom PHP and C queues (don’t roll your own unless you want long-term job security.)

The closing keynote was what I was mainly at the conference for … the PHP Frameworks Shoot-out with the framework project leaders.

Here are my notes from my perspective as a listener. Please email me with any corrections.

Agavi
- David
- borrow from Symfony PHPunit code
- would use Symfony as alternate
- CI ORM is a pointless reimplementation, Rails programmers are morons shaped by pragmatism of Rails model
- hates complexity of validation code, context from Mojavi too many interdependencies
- 5.3 nice to have universal exception handling fw
- believes 5.3 is a major new release not comparable to 5.1 or 5.2 that frameworks need to support
- bigger the team and complexity, better agavi is because more structure

CakePHP
- Nate Abele
- hates long class names
- hates ACL system needs to be redocumented or cleaned up
- PHP 4 at this point, next release on 5.3

CodeIgniter
- Ed Finkler
- Symfony generates too many files, brain hurts; input filtering in ZF overcomplicated
- hates complex routing, unlike Limonade
- CI is not recommended for complex systems

Symfony
- Fabien Potencier
- French
- full stack
- secure by default
- would use Django and ZF
- hates 1.1 form framework complexity that users painpoint
- 5.3 is nice, but no plans to port to 5.3 because of large users update cycle time

Zend Framework
- Matthew Weier O’Phinney
- ZF routing from Rails, lots of stealing
- would use CodeIgniter
- hates heavy-weight dispatch cycle in ZF, to be rewritten in 2.0
- 5.3 ZF already testing with it, backwards compatible to 5.2.

The sessions that had an audio recording will be released as podcasts, one per week, and hosted on devzone.zend.com.

Thanks to Zend for organizing a great conference.