Archive for the ‘Business’ Category

EclipseJet Saga Continues

Friday, November 14th, 2008

I followed the Eclipse Aviation saga from the beginning: promise of a $850k jet 5 years ago, mass production of personal jets, “friction-stir welding.”

A lot of people in the aviation industry were convinced this plane would never make it to production.

I knew that it would, but assumed the final production price would be a little higher. I always thought DayJet was a shill company of Eclipse to help write press releases.

Well, imagine my surprise when I saw TV video footage a few months back of DayJet’s operation, replete with dozens of Eclipse 500 jets. DayJet was real after all!

DayJet has since closed, blaming financing problems and Eclipse roll-out problems equally.

What’s fascinating is that Eclipse is selling DayJet’s inventory of 28 Eclipse 500s with the information that each airframe has 150 to 450 cycles. So it looks like DayJet really did some flying.

Eclipse has apparently missed payroll this month, so we’ll see what happens next.

avweb.com: Eclipse Selling Off DayJet Fleet
avweb.com: Analysts Grim On Eclipse Future
avweb.com: Eclipse Shutdown Predicted
avweb.com: Eclipse Misses Payroll: TV Report
kob.com: Unpaid Eclipse employees worry about future
South Florida Business Journal: DayJet Restart Speculation

OSCON 2008, Portland

Friday, July 25th, 2008

I attended the O’Reilly Open Source Conference, once again in Portland, Oregon.

Overall my impression was that the talks and vibe were oriented towards Web 2.0 primarily.

I would say that the talks were not as strong as previous years, but it’s easy to compensate for that with the “hallway track” and access to the original Open Source authors.

Several attendees used the EEE sub-notebook computer, and were happy with it as an email/browser tool.

Wednesday

PHP Taint Tool: It Ain’t a Parser

- CS’y effort at PHP parser for code analysis, reminds me of early days of Perl’s B tools
- not suitable for end-users

Write Beautiful Code (in PHP), Laura Thomson, Mozilla

- good general background on good programming practices
- not a lot of specifics about PHP, but available for questions

Hypertable, Doug Judd, Zevents

- Hypertable is a clone of Google’s BigTable, built from the public paper
- room was packed, some turned away
- still alpha, maybe beta in August
- preferred distributed filesystem is HDFS, works with others
- I recommend reading web site and then looking at the curt slides
- plans to do benchmarks with same hardware as Google has published.

Open Source Virtualization for People Who Feel Guilty About Using VMware So Much, andy michelle, EDA

- cute talk about VirtualBox, Xen and VMware
- Xen has weird nomenclature compared to other tools
- VMware wins on tools and polish
- showed screenshots of unreleased and alpha mgmt. tools.

Barely Legal XXX Perl, Jos Boumans, RIPE

- stunning and twisted example of overloading, short-circuiting, import-faking, whatever it takes to make a loaded module do something other than intended
- illustrates great flexibility of perl, for good or ill
- could be useful for things like testing harnesses, etc.
- motivated to win bet of $100 or 1 vertical meter of beer
- said it took 3 or 4 hours to complete.

I walked around the exhibits area.

Got a demo of Atlassian’s continuous integration (CI) tool, Bamboo. They’re also the vendors of JIRA issue tracker and Confluence wiki, which I’ve used before.

One company had a public Wii game happening.

Thursday

Scaling Databases with DBIx::Router, Perrin Harkins

Ultimate Perl Code Profiling, Tim Bunce (Shopzilla)

- talk and screenshots about NYT perl profiler


The New York Times Perl Profiler

Top 10 Scalability Mistakes, John Coggeshall (Automotive Computer Services)

- good overview of writing high-performance, maintainable Internet systems
- interesting opinion that scalability is not just about increasing performance. scalability can be about scaling up or down, performance or maintainability, etc.
- recommended php.ini settings list

Perl Lightning Talks

- popular with audience, attendees seemed to like all the talks
- Mail::ESMTP looks very interesting for testing and production

Code is Easy, People are Hard: Developing Meebo’s Interview Process, Elaine Wherry (meebo)

- struggled to find time, right approach to interview new candidates in 1996, likely at behest of VCs
- external recruiters hit-and-miss, conferences and jobs email link useless
- phase where non-founder employees doing interviews wanted a founder involved in interview process
- trying to preserve culture (finger rockets, social networking, 2 female founders, etc.)
- came up with process involving reading resumes, phone screens, and office “sim” that adds a new candidate within 3-6 weeks
- “sim” has 3 versions: office manager (plan to erect a meebo office sign), front-end engineer (write a JavaScript app), and back-end engineer (write a server) in 4 hours
- current goal is to keep interview time down to 8 hours per candidate over 10 days
- now up to about 40 employees
- my feeling was that their hiring process started off clueless due to inexperienced mgmt. and is still oriented towards junior engineers. Silicon Valley is full of expert engineers and it doesn’t take 8 hours to interview them.

BOF

mysql-sandbox

Giuseppe Maxia discussed and demoed his very useful mysql-sandbox utility for managing several versions and instances of MySQL on the same machine.

He wrote it for his testing work at MySQL AB. Very well received by attendees. This is a great example of what I call “anti-virtualization” - using ports instead of resource-intensive VMs.

MySQL Conference 2008 Presentation

State of the Onion Address, Larry Wall

- talk about Perl6, random anecdotes, etc.

Friday

Open Voices, Jim Zemlin (The Linux Foundation), Keith Bergelt (Open Invention Network), Karen Sandler (Software Freedom Law Center), Phil Robb (Hewlett Packard)

- panel discussion of various free software efforts, some little-known

An Illustrated History of Failure, Paul Fenwick (Perl Training Australia)

Paul gave an interesting talk on notable Software Failures and estimated a price tag for each. I had heard news reports of many of them, but it was interesting to hear an updated analysis of what really happened behind the scenes.

Thanks to Google for sponsoring the fairly good almost-gourmet lunches. Sure beats the O’Reilly lunchbags from the dot bomb days. (Everybody I know bailed and found a subway shop back then.)

Notes

- Burgerville popular with attendees, can upgrade combos to a shake.
- Red Lion hotel has a small cardio gym with 1 universal machine, no free weights, open til 11 pm
- WiFi password changed weekly, in middle of remodel, lobby just finished.
- There is a 24-Hour Fitness that is actually open 24 hours near downtown Portland. Has basketball court and 2-lane pool. $15 for non-member visitors.

OSCON 2008 Presentations

MySQL Conference 2008

Thursday, April 17th, 2008

I attended the MySQL Conference once again at the Santa Clara Convention Center.

Despite the January purchase by Sun, the conference had the same great vibe as usual, and everybody showed up again.

Top Conference Themes

Some of the conference themes I noticed are:

  1. Linux LVM snapshots are now popular for MySQL backups. Snapshots have long been used in enterprise IT, and now they work well and for free on Linux. Another use for a snapshot backup is to copy a busy master offline for comparing to a slave with Maatkit mk-table-checksum.
  2. DRBD is popular for HA (one speaker wondered if his talk about DRBD was still relevant since everybody was already using it), but I see some drawbacks.
  3. developers are concerned with supporting massively multi-core CPUs in both database and cache code. The Sun Niagara multi-core architecture seems to be the future, with 128 or more threads.
  4. databases and storage are quickly increasing in size, so many DBAs are interested in MySQL 5.1 partitioning and other tools and techniques.
  5. cloud computing is entering common use with lots of Amazon EC2 and S3 users. mosso.com has been available for a couple years from Rackspace, and Google and IBM are entering cloud computing. Users would like more competition to reduce prices and improve reliability, SLA or not.
  6. most companies say it’s hard to find experienced MySQL DBAs, but most are lazy when it comes to training and compensation.

Top MySQL Conference Tips

  1. Disable swap if your version of Linux supports it (most do.) This avoids getting some of MySQL’s pages swapped out and crippling the box with IO.
  2. Use memcached. MySQL User Defined Functions (UDFs) for memcached are now available to auto-populate memcached from MySQL statements.
  3. Consider multi-level partitioning schemes with MySQL 5.1, like combining RANGE and KEY.
  4. Consider MySQL High Availability (HA), either with replication, DRBD, or both. Linux HA project.

Top MySQL Conference Misconceptions

I had to straighten out a lot of newbs:

  1. statement-based replication is not reliable, does not have checksums or two-phase commit, and masters and slaves tend to diverge over time
  2. many novices believe InnoDB does row-locking only, but it often does range and table locking
  3. mysqldump is ok in most cases, but you have to be careful with locking and locktime, matching charsets, and testing the dump.

Here are my notes on some of the talks I attended. (In case you haven’t read my blog before, I’m a long-time user of MySQL, replication, LVM and memcached. I have not used DRBD.)

If you have a correction or improvement, please leave a comment and I will update this blog entry.

Monday (Tutorials)

Building Scalable & High Performance Datamarts with MySQL, Tangirala Sarma

Discussed general DW concepts at first.

3 main requirements for a successful DW project are:

  1. good data quality
  2. the right tools
  3. phased results.

Talked about various partitioning schemes in MySQL 5.1. I’ve used it for about 6 months in one project, but most of the audience was new to MySQL partitioning and seemed to struggle with anything beyond RANGE partitioning for logging.

Later talked about MySQL-particular DW aids, including:

  • Kickfire appliance, which has capabilities such as column-store, compression and fast loading.
  • Infobright, which also has column-store and fast loading.
  • Nitro

His recommended references are:

  • DW Toolkit, Kimball on Amazon.com
  • Enterprise DW with MySQL, MySQL AB
  • MySQL Roadmap 2008-2009, MySQL AB.

Also, there’s a list of books here:

DW and BI Starter Books

Queued up for sandwiches and salad. Not really a surprise with O’Reilly as the conference organizer, but I expected more.

Ate lunch with the DRBD programmers. They said that MySQL AB now provides 1st- and 2nd-level support, and their company provides 3rd-level support and cashes checks. :)

Memcached and MySQL: Everything You Need To Know, Brian Aker (MySQL), Alan Kasindorf (Six Apart)

Very detailed talk about tips and issues with using memcached.

  • Evolving online notes
  • Brian wrote about 30 man pages for memcached, edited by Mark Atwood. Unusual amount for an Open Source project.
  • memcached is very handy for people stuck with databases on 32-bit systems and a lot of otherwise unaddressable memory.
  • Patrick from Grazr has written MySQL UDFs to populate memcached, has a SoC student. Pipelines cache inserts. Handy for distributed DCs already using replication, triggers.
  • PostgreSQL has pgmemcache()
  • lighttpd has mod_memcache, prolly url hash key
  • Apache has mod_memcached with CAS, GET/PUT/DELETE, still alpha, try at pandoraport.com
  • limits: key size 250 bytes, data size 1 MB, 32/64 bit limits
  • threading is new based on giant mutex, bad for more than 8 cores
  • stop swapping with -MLOCKALL, noswap, sizing
  • stats sizes to test efficiency
  • command line option to disable LRU
  • CRC most consistent hashing, normal
  • ring consistent hash
  • IP takeover
  • bad switches or intermittent network is very bad
  • pick a driver that can do multiget
  • ghetto lock
  • Tim Bunce’s Cache::Memcached::Libcache does not do Storable, which is what most people prolly want
  • time in seconds < 30 days is relative, > 30 days is absolute
  • namespace trick: versions in key name
  • uint32_t type parameter usually indicates whether compressed or storable, can be used for anything though.
  • memcached_tool: memcp, memrm, memstat, memslap (load testing)
  • showed Mixi MRTG graphs: 6800 reads/second, 200 servers, no CPU load
  • Brian added IPV6 support after mysqld update (pet project, but helped with multi-interface support and to optimize out name resolution)
  • binary protocol code available, not merged yet, helps with multi-byte charsets that embed spaces or newlines which break the text protocol
  • improvements needed are durable to disk, highly-threaded
  • persistent connections are good and recommended
  • UDP alpha code available, good for lots of sets
  • storing BLOBs with MogileFS or LUSTRE good
  • speakers did not have experience with commercial caches, but did say that most people find Java caches often too featureful and slow
  • MogileFS, Hypertable, Hbase interesting
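
To make the "ring consistent hash" bullet concrete, here is a minimal sketch of a hash ring in plain Python. It is not the code of any particular memcached client: the class name `HashRing`, the use of MD5, the 100 points per server and the server names are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each server owns many points on a ring,
    and a key belongs to the first point found clockwise from its hash."""

    def __init__(self, servers, points_per_server=100):
        self._ring = []  # sorted list of (hash_value, server)
        for server in servers:
            for i in range(points_per_server):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # First ring point past the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


servers = ["cache1:11211", "cache2:11211", "cache3:11211"]
keys = [f"user:{n}" for n in range(1000)]

before = HashRing(servers)
after = HashRing(servers[:2])  # cache3 drops out of the pool

moved = [k for k in keys if before.server_for(k) != after.server_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved")
```

The point of the ring is that only the keys that lived on the removed server get remapped. With a plain "hash modulo number of servers" scheme, most keys would move and the cache would go cold.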

Tuesday

EXPLAIN Demystified, Baron Schwartz (Percona Inc.)

Room was packed for this talk. Good step-by-step talk for understanding EXPLAIN better.

Replication Tricks and Tips, Lars Thalmann (MySQL), Mats Kindahl (MySQL)

Some good tips, but overall an assumption is made by MySQL AB that a MySQL master and slave actually have the same data.

  • mysqlbinlog has --hexdump option for seeing byte-level dump
  • examine both binlog and relay log when debugging replication
  • you can clone a slave from another slave if you trust it - just do STOP SLAVE, SHOW SLAVE STATUS, shutdown, copy over the files, CHANGE MASTER TO, and START SLAVE
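
For the slave-cloning tip, the statement that points the new slave at the master is CHANGE MASTER TO, fed with the coordinates the donor slave had executed when it was stopped. The sketch below only builds that statement from a dict shaped like one row of SHOW SLAVE STATUS. The helper name `change_master_sql` and the sample values are invented, and credentials are left out on purpose.

```python
def change_master_sql(slave_status):
    """Build a CHANGE MASTER TO statement from the donor's SHOW SLAVE STATUS.

    Relay_Master_Log_File and Exec_Master_Log_Pos give the master binlog
    position the donor had actually executed when it was stopped.
    """
    host = slave_status["Master_Host"]
    log_file = slave_status["Relay_Master_Log_File"]
    log_pos = int(slave_status["Exec_Master_Log_Pos"])
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST='{host}', "
        f"MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={log_pos};"
    )


# Hypothetical values, as captured on the donor slave after STOP SLAVE.
status = {
    "Master_Host": "db-master.example.com",
    "Relay_Master_Log_File": "mysql-bin.000042",
    "Exec_Master_Log_Pos": "1073",
}

sql = change_master_sql(status)
print(sql)
```

Run the printed statement on the new slave after the files are copied, then START SLAVE and check SHOW SLAVE STATUS on both machines.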

Dramatically Improving MySQL Database Performance in Data Warehouse Applications, Martin Farach-Colton (Tokutek)

His view is that the storage engine is the primary bottleneck in BI systems for loading and search, and showed how to use a B-tree to organize data to guarantee the maximum performance in a growing DW. (Not sure why the online summary talks about fractal trees.)

How to Achieve Operational BI on a Budget, Lance Walter (Pentaho Corporation)

Kind of overly Pentaho-product oriented, since Lance is a product manager for Pentaho.

However he made a useful distinction between historical and operational BI.

Historical BI is mainly reporting on what happened before yesterday, and operational is up-to-the-minute for business process analysis and improvement.

By designing your BI system for both historical and operational requirements, you can get both at the same time.

Interesting case study of the US Navy doing BI on pilot training and operations to reduce accident risk.

MySQL Backup BOF (hosted by Zmanda rep)

- admins attending generally unaware of LVM, but awareness growing
- one guy split his database across multiple databases for easier mgmt. and uses an HP SAN with 1 TB of RAM, very happy with IO performance, not so happy with price.
- one guy using 100 EC2 instances and S3. ok except for recent outages, maybe a little pricey.
- one guy using dd for fast network copies
- Zmanda just integrates existing techniques and allows scheduling, but prolly quite useful for inexperienced DBAs to do point-in-time recovery and schedule backups.

MARIA BOF

Hosted by Monty with his usual wicked Finnish black vodka.

He said Falcon was supposed to take 3 months, but slipped, so work started on MARIA to replace MyISAM. He has made promises to deliver a working storage engine, so will continue and do so and prolly release MARIA in 6 months.

All of the MARIA programmers were required to read Jim Gray’s textbook Transaction Processing: Concepts and Techniques on Amazon.com, and each have said they understand it.

I got the impression that MARIA should end up with a cleaner codebase than InnoDB.

Discussed table-level checksums for replication checking, which he is planning to do, might be an option. Still time to decide whether to use CRC32 or another algorithm.
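
To illustrate what a table-level checksum buys you for replication checking, here is a minimal sketch that sums per-row CRC32 values so the result does not depend on row order. This is not MARIA's design and not how Maatkit does it. The function names, the column separator and the sample rows are assumptions for the example.

```python
import zlib


def row_crc(row):
    """CRC32 of one row; NULLs get their own marker so they differ from ''."""
    text = "\x1f".join("\\N" if v is None else str(v) for v in row)
    return zlib.crc32(text.encode("utf-8"))


def table_checksum(rows):
    """Return (row_count, checksum). Summing modulo 2**32 makes the
    checksum independent of the order the rows are read in."""
    total = 0
    count = 0
    for row in rows:
        total = (total + row_crc(row)) & 0xFFFFFFFF
        count += 1
    return count, total


master = [(1, "alice", 10), (2, "bob", 10), (3, None, 7)]
slave_ok = [(3, None, 7), (1, "alice", 10), (2, "bob", 10)]   # same rows, other order
slave_bad = [(1, "alice", 10), (2, "bob", 11), (3, None, 7)]  # one value drifted

print(table_checksum(master) == table_checksum(slave_ok))
print(table_checksum(master) == table_checksum(slave_bad))
```

A 32-bit checksum can collide, so a match is strong evidence rather than proof. That is presumably part of the CRC32-versus-another-algorithm decision mentioned above.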

One of his programmers gave a demo of yanking power on MARIA-current on his notebook computer, though I didn’t notice what the outcome was.

He said that ALTER TABLE optimizations are planned, including instantly dropping an index without copying the table, though add would still require copying the table.

Another conversation could be paraphrased as, “After MARIA, he will work with the Sun team that fixed PostgreSQL threads to fix various thread scaling problems in MySQL.”

Sphinx Search BOF

Hosted by Peter Zaitsev and Andrew Aksyonoff, the author.

Percona has used Sphinx Search in a few projects for web forum searching for the past few years. Separate index server is recommended.

About a dozen people talked about full-text search requirements and experience with Sphinx. One guy was spending a lot of money on encad(?) and didn’t want to spend more on a bigger license later.

Wednesday

MySQL Performance Under a Microscope: The Tobias and Jay Show, Tobias Asplund (MySQL), Jay Pipes (MySQL)

Presented slides on the performance of various workloads.

The MySQL Query Cache, Baron Schwartz (Percona Inc.)

Excellent in-depth discussion of the query cache.

Grazr: Lessons Learned Building a Web 2.0 Application Using MySQL, Patrick Galbraith (Grazr Inc.), Michael Kowalchik (Grazr Corporation)

Deadlocks, Wait Timeouts, and Other Transaction Issues, Jess Balint (MySQL)

Thursday

DTrace and MySQL, Ben Rockwood (Joyent Inc)

Good talk on using DTrace specifically with MySQL, mainly query debugging.

Scaling with MySQL using Materialized Views and a Shared Everything architecture, Moshe Shadmon (ScaleDB)

Listed 3 ways to do materialized views, but dwelt on their ScaleDB cluster product mostly.

High Availability MySQL with DRBD and Heartbeat: MTV Japan Mobile Services, Patrick Bolduan (MTV Networks Japan KK), Yoshinori Matsunobu (MySQL)

Talked about setting up HA using heartbeat, pingd and DRBD on a mostly-reads 5 GB CMS db. Used Enterprise MySQL distro and support. No replication involved, likes mysqldump. Happy with MySQL 5.0 and Unicode with Japanese. Cute slides with Japanese maru symbols, etc. DRBD staff were on-hand to help with more difficult questions. Said LVM can be used under or over DRBD.

The Science and Fiction of Petascale Analytics, Jacek Becla (SLAC)

Talked about petabyte and exabyte DW for physics and astro programs. Contrasted science and industry PB databases (Google, MSN and Yahoo! likely each have 100 PB databases, but don’t disclose the size.) 5 years to plan DW for next experiment.

Spent 10 minutes with Rohit Nadhani and his programmer from Webyog looking at their MonYOG 2.01 version. Provided UI feedback for database operations use based on several months usage. Looking good.

Talked with Patrick Bolduan, somebody from DRBD and a NY Times IT guy who uses EC2 in the lounge. Apparently S3 can lose up to 10% of insert requests. Everybody is looking forward to when Amazon EC2 and S3 have competition for both pricing and reliability improvements.

Conference Evaluation and Recommendations

I go to a lot of conferences as a paying attendee, so I usually provide feedback to the organizers to help them improve the experience.

I mentioned to Jay that overall the conference was fine:

  • Talks were good, maybe less technical than previous years. Jay said Sun wanted more talks for novices since 2007 was too hardcore for some new attendees, but O’Reilly did not want to ghettoize newbs with a single track and room. For sure 2007 had too many sharding talks, mostly with users stuck on 4.0. There’s always the hallway track with developers anyway.
  • Food was ok but not great, although O’Reilly, the conference organizer, slipped in sandwiches for lunch on tutorial day. (SCCC is in an isolated location, so food is a big deal.) Still way better than OSCON in recent years. I guess the original MySQL conferences at the DoubleTree spoiled me.
  • Some of the vendors got too salesy in their presentations, but it’s hard to crack down on sponsors. Pentaho and ScaleDB come to mind.
  • conference still provides full access to MySQL managers and key programmers, in the best Open Source tradition
  • Still need a big-iron room with functioning demo SANs and HA setups, as I’ve suggested for a few years.

MySQL Conference 2008 Presentations
TechCrunch: Rackspace Offers Cloud Computing with Mosso
cnet.com: Is cloud computing more than just smoke?

YouTube BGP Disappearance

Monday, February 25th, 2008

YouTube disappeared from the Internet on Feb 24 after a Pakistani ISP tried to block it by advertising a more specific route in BGP.

BGP is used mainly for routing, but you can also do geographical load-balancing with it.

It’s unlikely that secure BGP will take off anytime soon because of the overhead involved.

But companies may consider advertising only /24 blocks to prevent others from being able to advertise more specific routes. The downside to that is bigger routing tables and more memory required to hold them for everybody.
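
The reason a more specific route wins is longest-prefix matching: a router forwards to the matching route with the longest prefix. Here is a minimal sketch using Python's ipaddress module. The prefixes and origin labels are made up for illustration and are not the ones from the actual incident.

```python
import ipaddress


def best_route(routes, address):
    """Pick the matching route with the longest prefix, as a router would."""
    addr = ipaddress.ip_address(address)
    matches = [(net, origin) for net, origin in routes if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda r: r[0].prefixlen)


# Hypothetical announcements: the owner advertises a /22,
# somebody else advertises a /24 carved out of it.
routes = [
    (ipaddress.ip_network("10.20.152.0/22"), "legitimate origin"),
    (ipaddress.ip_network("10.20.153.0/24"), "hijacker"),
]

inside = best_route(routes, "10.20.153.5")
outside = best_route(routes, "10.20.154.5")
print(inside[1])   # address covered by both routes
print(outside[1])  # address covered only by the /22
```

If the owner already advertises /24s, a competing /24 no longer wins on specificity alone, and a prefix longer than /24 is commonly filtered by other networks.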

cnet.com: How Pakistan knocked YouTube offline (and how to make sure it never happens again)
arstechnica.com: Insecure routing redirects YouTube to Pakistan
NANOG thread: YouTube IP Hijacking
Internet-Wide Catastrophe—Last Year

Free Keynote Internet Health Report
Renesys Blog
BG4.AS
BGPlay
wikipedia: BGP, Autonomous System
cnet.com: YouTube disappears from the screen temporarily

Ruby Test Automation Tools

Thursday, February 14th, 2008

I was talking to a QA Manager friend of mine today, and got his opinion on test automation.

I wasn’t surprised that his opinion was similar to mine, but I was surprised that our opinions were basically identical.

Test automation is:

  1. ok for APIs, less so for UIs
  2. no panacea or replacement for manual testing
  3. real programming
  4. too painful if your app is not prepared for testing
  5. inefficient on a moving target.

When I need to do automated web testing, I usually use perl HTTP::WebTest.

He’s been looking around at some Ruby tools, mainly Selenium. RSpec is another. (OpenQA has a bunch of tools on their site.)

AtomicObject is a consulting company with a very interesting webpage on their Ruby testing philosophy and tools.

Another Ruby environment that looks interesting is Heroku.

iSkoot Skype Client for Smartphones

Thursday, February 14th, 2008

iSkoot has a Skype voice and chat client for Smartphones.

I installed it on my BlackBerry 8700g and the chat client seems to work ok. No emoticons though.


iSkoot Skype Client for BlackBerry

I’ve been using ShapeServices’ $25 IM+ for Skype client for about 6 months on my BlackBerry 8700g for work. Works fine.

Nice to have a free alternative though.

ShapeServices also markets an iPhone Safari-based Skype client now.


IM+ Skype for iPhone
ShapeServices Skype Safari Client for BlackBerry

Electronic Banking Tokens in Indonesia

Friday, January 18th, 2008

It’s interesting to see how a developing nation like Indonesia does online banking.

Less than 1% of the population has a computer at home, and even fewer have a home Internet connection. Instead more people have basic cell phones for sending SMS messages primarily.

Internet banking is desirable for office workers in the capital of Jakarta to avoid traffic jams and check payments.

What’s different about Internet banking in Indonesia compared with the USA is that Indonesia uses true 2-factor authentication: something you know (PIN) and something you have (a hardware access token.)

Unfortunately in the US, we have 1.5-factor authentication for online banking: something you know (PIN) and something else you know (SiteImage, etc.) Good luck getting an access token from most US banks.
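
To show how a "something you have" token can work in principle, here is a minimal sketch of the public HOTP algorithm from RFC 4226: the token and the bank share a secret and a counter, and each button press yields a short one-time code. I don't know which algorithm the Indonesian bank tokens actually use, so treat this purely as a generic illustration. The secret below is the RFC's published test value, not a real key.

```python
import hashlib
import hmac
import struct


def hotp(secret, counter, digits=6):
    """One-time password per RFC 4226 (HMAC-SHA1 with dynamic truncation)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte pick a window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


secret = b"12345678901234567890"  # test secret from RFC 4226, Appendix D
for counter in range(3):
    print(counter, hotp(secret, counter))
```

The server runs the same calculation with its copy of the secret and accepts the login only if the codes match, so a stolen PIN alone is not enough.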

Why is Indonesia more serious about authentication? I think it has to do with a variety of factors. In Asia, generally companies don’t have refund policies, so the initial transaction has to be correct.

Also, Indonesia is a hotbed of online fraud, which pays far better than the national min. wage of $90/month. And computer anti-virus and firewall updates are sporadic due to lack of licenses and the poor Internet connectivity from Indonesia to outside.

BCA, Mandiri, and Niaga banks all require access tokens. Each bank has chosen a different color and shape.

KeyBCA is a blue triangle manufactured by Vasco.


KeyBCA PIN Entry
KeyBCA Balance Transfer

Next Gen Credit Card: Kartu Debit dengan “KeyBCA” di dalamnya

Cause of Subprime Mortgage Crisis: Americans can’t do Math

Friday, December 14th, 2007

I’ve been following the subprime mortgage crisis fairly closely and think I know the cause … Americans just can’t do math.

Half of the ARM borrowers were qualified for regular (and less expensive) mortgage terms - but didn’t ask for the better terms.

Many new mortgage-holders used their house as a private ATM machine: as soon as the price inflated, they signed for Home Equity Lines of Credit (HELOCs) and went on $50,000 shopping sprees.

Readily-available mortgage insurance ensured that lenders didn’t care who was on the other end of the pen. Now that jumbo loan insurance is limited to $417,000, lending standards have tightened dramatically.

MBA Data Confirms Popularity of ARMs and Interest-Only Products Despite Overall Mortgage Origination Decline
money.cnn.com: Help! Our kids are driving us broke
cnn.com: Wealthy may be next in line in home crisis