Mon, 22 Nov 2004

Selling Apple

I sold half my Apple stock at $61.33 today. It's pretty damn expensive at this level.

business | Permanent Link

Thu, 18 Nov 2004

Google, Say It Isn't So Already

Some web hosting customers, especially those with dubious business plans who have hired SEO witch doctors, request that every web site be placed on a separate IP address. The theory is that Google is likely to discount the links between sites on the same IP address because such sites may be trying to artificially increase their PageRank. It would be silly for Google to do this since there could be hundreds or thousands of unrelated sites that share an IP address, simply because they are customers of the same web hosting company. Nonetheless, this seems to be a widely-held belief.

This is frustrating to those in the hosting business because IP(v4) addresses are a finite resource, and here in North America, requests for new IP addresses for IP-based web hosting must include a technical justification.

Today, Sean emailed Google to ask for clarification. I hope that Google will publicly state that spreading sites across multiple IP addresses does nothing to inflate PageRank.

Update (Thu, 09 Dec 2004): Google responds by saying, go ask someone else to speculate.

Date: Thu, 09 Dec 2004 14:33:35 -0800
From: help@google.com
Subject: Re: [#17061378] Regarding the need for a static IP Address

Hi Sean,

Thank you for writing to us. Please note that we don't comment on
webmaster techniques or the details of our search technology beyond what
appears on our site.

We've dedicated an entire section of our site to answering the most common
questions from those who maintain and/or promote websites. You'll find all
of our publicly available information posted at
http://www.google.com/webmasters/index.html

Besides this section of our site, we've created a newsgroup discussion
forum for passionate Google users. At http://groups.google.com in the
http://groups.google.com/groups?q=google.public.support.general group,
many webmasters and Google users share their questions and expertise.

We recommend performing an advanced search on this group if you feel your
question is particularly challenging and you've been unable to find an
answer on our site. To do so, go to
http://www.google.com/advanced_group_search?hl=en and enter your search
terms in one of the 'Find messages' fields at the top of the page. Type
'google.public.support.general' in the 'Newsgroup' search field, and click
'Google Search.' If you don't find an answer to your question, you can
always post your question to the group to see if other newsgroup users
have helpful advice. Please be aware that this content isn't posted by
Google, and we cannot verify its accuracy.

Regards,
The Google Team

tech | Permanent Link

Apache Virtual Host Bandwidth Monitoring

Also, a good example of the READ-COMMITTED isolation level

mod_accounting is a simple Apache module for recording bandwidth usage on a per-virtual host basis. It can write to either a MySQL or PostgreSQL database. You just add a couple directives to httpd.conf to tell mod_accounting which database to write to, the query to run, and how often to write to the database.
Here's the configuration I use.

<IfModule mod_accounting.c>
AccountingQueryFmt "   \
  INSERT INTO          \
    vhost_accounting   \
  (                    \
    bytes_in,          \
    bytes_out,         \
    virtual_host,      \
    host               \
  )                    \
  values               \
  (                    \
    %r,                \
    %s,                \
    LOWER('%h'),       \
    'host.example.com' \
  );"
AccountingTimedUpdates 300
AccountingDatabase vhost_accounting
AccountingDatabaseDriver mysql
AccountingDBHost stats.example.com 3306
AccountingLoginInfo vhost_accounting XXXXXXXX
</IfModule>

And the corresponding table.

CREATE TABLE `vhost_accounting` (
  `virtual_host` varchar(128) NOT NULL default '',
  `bytes_in` int(20) unsigned NOT NULL default '0',
  `bytes_out` int(20) unsigned NOT NULL default '0',
  `timestamp` timestamp(14) NOT NULL,
  `host` varchar(128) NOT NULL default '',
  `id` bigint(20) unsigned NOT NULL auto_increment,
  PRIMARY KEY  (`id`)
) TYPE=InnoDB

If you have tables of hosts and virtual hosts, you'll probably want to normalize the database and modify AccountingQueryFmt to use the foreign key to the vhost. For example,
AccountingQueryFmt "INSERT INTO vhost_accounting (bytes_in, bytes_out, virtual_host_id) SELECT %r, %s, id FROM vhost WHERE virtual_host_name = LOWER('%h');"
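
One thing to watch with the normalized query: if the requested host name has no matching row in vhost, the INSERT ... SELECT writes nothing, so that traffic goes unrecorded. Here is a rough sketch of that behavior. It uses Python's sqlite3 module in place of MySQL so it runs without a server. The vhost table layout and the record() helper are assumptions made for illustration; only the id and virtual_host_name columns come from the query above.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (an assumption
# made so the sketch runs anywhere).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table: only id and virtual_host_name are implied
    -- by the normalized query.
    CREATE TABLE vhost (
        id INTEGER PRIMARY KEY,
        virtual_host_name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE vhost_accounting (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        bytes_in INTEGER NOT NULL DEFAULT 0,
        bytes_out INTEGER NOT NULL DEFAULT 0,
        virtual_host_id INTEGER NOT NULL REFERENCES vhost(id)
    );
    INSERT INTO vhost (virtual_host_name) VALUES ('www.example.com');
""")


def record(conn, bytes_in, bytes_out, host_name):
    """Run the normalized INSERT with bound parameters in place of %r, %s, %h.

    Returns the number of rows actually inserted.
    """
    cur = conn.execute(
        "INSERT INTO vhost_accounting (bytes_in, bytes_out, virtual_host_id) "
        "SELECT ?, ?, id FROM vhost WHERE virtual_host_name = LOWER(?)",
        (bytes_in, bytes_out, host_name),
    )
    return cur.rowcount


known = record(conn, 512, 20480, "WWW.Example.com")    # matches after LOWER()
unknown = record(conn, 100, 200, "typo.example.com")   # no row in vhost
print(known, unknown)
```

If unrecorded traffic matters, one option is to keep a catch-all row in vhost, but that is a design choice outside what mod_accounting itself provides.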

There are a couple issues that make deploying mod_accounting to multiple machines a bit tricky. Each Apache process does its own logging, and thus maintains its own connection to the database server. If you have hundreds of Apache processes and multiple servers, you'll quickly have thousands of open connections to the database. To reduce the number of concurrent connections, I made a one-line change to mod_accounting to close the connection to MySQL after each write, so a new connection is opened each time data is written to the database.

The next problem is simply the large amount of data that is collected. It quickly grows to the point where generating reports in real-time is too slow. To solve this problem, I created a summary table, vhost_accounting_summary, to consolidate the data produced by each Apache process. I also decided that hourly statistics were sufficient for my needs.

The records in the vhost_accounting table are removed once the data has been inserted into vhost_accounting_summary. One other table, vhost_accounting_timestamp, is used to keep track of when the summary table was last updated and to ensure that data from vhost_accounting does not accidentally get written to vhost_accounting_summary multiple times. (More details below.)

CREATE TABLE `vhost_accounting_summary` (
  `virtual_host` varchar(128) NOT NULL default '',
  `bytes_in` int(20) unsigned NOT NULL default '0',
  `bytes_out` int(20) unsigned NOT NULL default '0',
  `timestamp` timestamp(14) NOT NULL,
  `host` varchar(128) NOT NULL default '',
  KEY `virtual_host` (`virtual_host`),
  KEY `idx_timestamp_host` (`timestamp`,`host`)
) TYPE=InnoDB

CREATE TABLE `vhost_accounting_timestamp` (
  `timestamp` timestamp(14) NOT NULL
) TYPE=InnoDB

And here's the SQL to update the vhost_accounting_summary table.

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN;
SELECT timestamp FROM vhost_accounting_timestamp FOR UPDATE;
SELECT @max_id:= max(id) FROM vhost_accounting;
INSERT INTO
        vhost_accounting_summary
SELECT
        virtual_host,
        SUM(bytes_in),
        SUM(bytes_out),
        DATE_FORMAT(timestamp, '%Y-%m-%d %H:00:00') AS dayhour,
        host
FROM
        vhost_accounting
WHERE
        id < @max_id
GROUP BY
        virtual_host,
        dayhour;
DELETE FROM vhost_accounting WHERE id < @max_id;
UPDATE vhost_accounting_timestamp SET timestamp = now();
COMMIT;

There are a couple important points to make about the SQL above. We want to make sure that mod_accounting never has to wait on a lock to write to the vhost_accounting table. In addition, we want to ensure that, if multiple copies of the update script run concurrently, the same data does not get written to vhost_accounting_summary multiple times.

Mod_accounting does its work while processing a request. If mod_accounting is waiting for a lock, the Apache process will not finish processing the current request until the lock is removed or a lock timeout occurs. This can quickly cause all of the Apache children to be tied up and prevent new requests from being processed.

To prevent a lock from being set, we only act on records whose primary key is less than the current maximum (id < @max_id), so the newest row is always left alone. This prevents a next-key lock from being set when we delete the records we have just processed.

To prevent multiple instances of the script from causing redundant data to be stored, we do a locking read of the timestamp by using FOR UPDATE. This lock only affects this summary script since mod_accounting does not use the vhost_accounting_timestamp table. Furthermore, we must use the READ-COMMITTED isolation level to guarantee that once the script receives the lock, it doesn't read any rows from vhost_accounting that have already been deleted.
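
To make the roll-up step concrete, here is a small sketch of the same "summarize everything below the current maximum id, then delete it" logic. It runs against an in-memory SQLite database through Python's sqlite3 module rather than MySQL. That is a substitution for convenience: SQLite has no FOR UPDATE or READ COMMITTED, so the locking described above is not reproduced here, and strftime() stands in for DATE_FORMAT(). The sample rows and the summarize() helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vhost_accounting (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        virtual_host TEXT, bytes_in INTEGER, bytes_out INTEGER,
        timestamp TEXT, host TEXT
    );
    CREATE TABLE vhost_accounting_summary (
        virtual_host TEXT, bytes_in INTEGER, bytes_out INTEGER,
        timestamp TEXT, host TEXT
    );
""")

# Made-up sample data: three rows for one vhost, plus a newest row that the
# roll-up should leave untouched.
conn.executemany(
    "INSERT INTO vhost_accounting "
    "(virtual_host, bytes_in, bytes_out, timestamp, host) VALUES (?, ?, ?, ?, ?)",
    [
        ("a.example.com", 10, 100, "2004-11-18 09:05:00", "host.example.com"),
        ("a.example.com", 20, 200, "2004-11-18 09:55:00", "host.example.com"),
        ("a.example.com", 30, 300, "2004-11-18 10:01:00", "host.example.com"),
        ("b.example.com", 40, 400, "2004-11-18 10:02:00", "host.example.com"),
    ],
)


def summarize(conn):
    """Roll rows with id < MAX(id) into hourly totals; return rows deleted."""
    with conn:  # one transaction: commit on success, roll back on error
        (max_id,) = conn.execute("SELECT MAX(id) FROM vhost_accounting").fetchone()
        if max_id is None:  # empty table, nothing to do
            return 0
        conn.execute(
            """
            INSERT INTO vhost_accounting_summary
            SELECT virtual_host, SUM(bytes_in), SUM(bytes_out),
                   strftime('%Y-%m-%d %H:00:00', timestamp) AS dayhour, host
            FROM vhost_accounting
            WHERE id < ?
            GROUP BY virtual_host, dayhour
            """,
            (max_id,),
        )
        cur = conn.execute("DELETE FROM vhost_accounting WHERE id < ?", (max_id,))
        return cur.rowcount


deleted = summarize(conn)  # first run consolidates the three older rows
second = summarize(conn)   # run again immediately: only the newest row is left
print(deleted, second)
```

Running the function twice in a row shows why sequential runs are harmless: the second pass finds nothing below the maximum id. Concurrent runs are a different matter, which is what the FOR UPDATE lock on vhost_accounting_timestamp is for.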

And finally, we're ready to generate some reports…

SELECT
  SUM(bytes_in) AS bytes_in,
  SUM(bytes_out) AS bytes_out,
  SUM(bytes_in + bytes_out) AS total,
  virtual_host,
  host
FROM
  vhost_accounting_summary
WHERE
  timestamp >= '2004-11-18 00:00:00' AND
  timestamp <= '2004-11-18 23:59:59'
GROUP BY
  virtual_host
ORDER BY
   total DESC
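
If you run this report from a script, the day boundaries can be computed instead of typed in. Below is a hedged sketch in Python, again using sqlite3 as a stand-in for MySQL. The daily_report() helper and the sample rows are assumptions for illustration. It uses a half-open range (>= midnight, < next midnight), which selects the same hourly rows as the 23:59:59 bound above.

```python
import sqlite3
from datetime import date, datetime, timedelta


def daily_report(conn, day):
    """Return (virtual_host, bytes_in, bytes_out, total) rows for one day."""
    start = datetime.combine(day, datetime.min.time())  # midnight of that day
    end = start + timedelta(days=1)                      # midnight of the next
    fmt = "%Y-%m-%d %H:%M:%S"
    return conn.execute(
        """
        SELECT virtual_host,
               SUM(bytes_in), SUM(bytes_out),
               SUM(bytes_in + bytes_out) AS total
        FROM vhost_accounting_summary
        WHERE timestamp >= ? AND timestamp < ?
        GROUP BY virtual_host
        ORDER BY total DESC
        """,
        (start.strftime(fmt), end.strftime(fmt)),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vhost_accounting_summary ("
    "virtual_host TEXT, bytes_in INTEGER, bytes_out INTEGER, "
    "timestamp TEXT, host TEXT)"
)
# Made-up hourly rows; the last one falls on the following day.
conn.executemany(
    "INSERT INTO vhost_accounting_summary VALUES (?, ?, ?, ?, ?)",
    [
        ("a.example.com", 30, 300, "2004-11-18 09:00:00", "host.example.com"),
        ("a.example.com", 30, 300, "2004-11-18 10:00:00", "host.example.com"),
        ("b.example.com", 1000, 5000, "2004-11-18 23:00:00", "host.example.com"),
        ("b.example.com", 1, 1, "2004-11-19 00:00:00", "host.example.com"),
    ],
)

report = daily_report(conn, date(2004, 11, 18))
for row in report:
    print(row)
```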

tech | Permanent Link

Tue, 16 Nov 2004

Exchange Replacements

I spent a few hours today researching Exchange replacements. These are products that are designed to replace Microsoft Exchange on the server, but still allow use of Outlook as a client, including the much-beloved calendaring features.
Here's what I came up with.

Date: Tue, 16 Nov 2004 16:24:55 -0800
From: "Christian G. Warden" <cwarden@zerolag.com>
Subject: Exchange replacement analysis - phase 1

There are a handful of products that claim to be Exchange replacements.
They all work in the same manner, using a custom MAPI connector, which
is basically a plug-in for Outlook, to access the server.  Each version
of Outlook has different features so most of these products only work
with certain versions of Outlook.  Because I'm not very familiar with
Outlook, it is difficult for me to tell if these products fully support
the features of Outlook.  We'll need to set up a test environment to
fully evaluate any of these products.

OpenGroupware[1]
This was previously a closed source server that was open sourced a
couple years ago.  I evaluated it briefly a year or so ago, and it
seemed stable and featureful, but had a bit of a clunky web interface.
It looks like development is pretty active, though.  I haven't evaluated
the Outlook connector, ZideLook[2], which is a commercial product which
costs about $50 per client.  There is no demo of ZideLook available.
ZideLook communicates with OpenGroupware using WebDAV.
OpenGroupware just handles the groupware functionality and integrates
with third-party IMAP servers.

1. http://www.opengroupware.org/en/index.html
2. http://esd.element5.com/product.html?cart=1&productid=517934&languageid=1&nolselection=1&currencies=EUR

SUSE LINUX Openexchange Server[3]
This is a commercial product.  It is distributed as a full Linux
distribution and cannot be installed on an existing Linux system. (Such
an installation would not be supported at least.)  Pricing is unclear.
The product is supposed to be available for purchase online at
novell.com, but isn't, perhaps because they are currently integrating
the product with Novell's Groupwise.
There is an online demo[4] and the Outlook connector is available for
download[5].  Openexchange is made up of a number of open source
components and comFire, the groupware component, which was licensed from
a company called Netline.  comFire has recently been open sourced by
Netline as Open-Xchange[6], but the Outlook connector is not licensed for
use with Open-Xchange.  The Outlook connector communicates with the
server using WebDAV.  There is a good article about Openexchange[7].

3. http://www.suse.com/us/business/products/openexchange/index.html
4. http://www.suse.com/us/business/products/openexchange/demo.html
5. http://www.suse.com/us/business/products/openexchange/download.html
6. http://mirror.open-xchange.org/ox/EN/product/
7. http://www.linux-magazine.com/issue/48/Suse_Linux_Openexchange_41.pdf

Bynari Insight Server[8] and Insight Connector[9]
I believe Bynari was the first company with an "Exchange replacement on
Linux" product.  Their Outlook connector allows calendars and address
books on an IMAP server.  It claims to require the Insight Server,
though Insight Server uses Cyrus as the IMAP server, so it may work with
a normal Cyrus server.  Insight Server is composed of a number of open
source products such as Postfix, OpenLDAP, and Apache.  Bynari seems to
think most of the value is in the Connector since a 1000 user license
for Insight Connector is $17,000, and a 1000 user license for a bundled
Insight Server and Insight Connector is $18,000.  (Insight Server
without the Connector is also sold for $1,000.)  A demo is available.

8. http://www.bynari.net/index.php?id=1169
9. http://www.bynari.net/index.php?id=7

BILL Workgroup Server[10]/Exchange4Linux[11]
Documentation is kind of spotty on this one.  I don't think it's worth
evaluating except as a last resort.

10. http://www.billworkgroup.org/billworkgroup/home
11. http://www.exchange4linux.com/exchange4linux/Home

None of the Above (IMAP/LDAP/SMTP/WebDAV or FTP)
Depending on the customer's needs, perhaps Outlook in "Internet Mail
Mode" will be sufficient.  IMAP supports shared folders, but I don't
know if it supports setting ACLs.  Outlook also supports LDAP for
address books, but I don't know if it supports updating the directory.
Outlook can send meeting requests and responses over email and publish
free/busy time over FTP (and, I think, either WebDAV or HTTP PUT), but I
don't know if this would meet the customer's needs.


I recommend trying out Openexchange first as it seems to be the most
open and widely deployed.

Christian

Comments from anyone who has deployed one of these products for use with Outlook would be appreciated.

tech » mail | Permanent Link

Sat, 13 Nov 2004

Blogs Need Pictures(?)

Russell says that blogs need pictures of their authors. I don't feel strongly one way or the other about it. I pretty much only read blogs in Straw, so I don't see authors' mugs very often. For those that care, though, I've added my picture. The right side was pretty barren anyway.

culture | Permanent Link

Google Ads are Hard

We haven't had much luck with Google Ads for Postica so far. There's a lot of competition in the spam filtering market so it's hard to stand out among 10 ads for spam/virus filtering products and services on the same page.

It's frustrating having what I think is a great service, and not being able to reach potential customers. We're mostly focusing on selling through partners now, but we haven't completely given up on Google Ads yet. Here's our latest.

business » marketing » advertising | Permanent Link

From Techie to Business Owner

I've figured out why I find sales so unpleasant. Being a business owner requires a much different mind-set than that of a typical techie.

I've been working as a sys admin and web developer for the past seven years. As such, I only have to sell myself to an employer or client once, typically in a single one- or two-hour interview. Having to sell a product or service to a customer who hasn't already "hired" you is a much different proposition.

Failure

As a techie, I don't often have to deal with failure. When working on a new project or fixing an existing system that is broken, I just keep working on it until it's done. Rarely must I concede defeat and give up hope of finding a solution.

Failure is a normal part of sales, though. Salespeople cannot always just keep working on a sales prospect until a deal is closed. This would likely result in restraining orders.

With Postica, we decided that Greg would focus on sales, while I would handle technical issues. Nonetheless, I participate in some sales calls, and I become frustrated when sales don't increase as quickly as I would like.

Repetition

Direct sales requires a lot of repetition. You generally have to give the same pitch over and over to each new potential customer. This conflicts with my natural desire to automate repetitive behavior. When I come across a task that I have to do repeatedly, I usually write a script or subroutine to automate the process.

Part of marketing is to automate the sales message, but when bootstrapping a business with limited capital, much of your marketing message must be delivered personally to potential customers.

Mind-set

I need to find some good books on sales that explain how to get into the correct mind-set.

Michael Cage discusses the need to train non-salespeople for sales. In order to do this, the non-salespeople must learn to think like salespeople.

A couple years ago, I really got into blackjack. Being successful at blackjack basically requires becoming a computer. You must keep track of the count, and bet and play exactly as the rules (based on calculated odds) that you've memorized tell you that you must. You must not become emotional after losing a lot of money when doubling after splitting (though you may want to feign being upset for the pit boss's benefit). What happens during a single hand is irrelevant. Your goal is to grind out a profit by having a slight advantage over the house based on the rules and your ability to vary play based on the count.

A fantastic book on blackjack is Million Dollar Blackjack. It describes how Ken Uston and his team won $4,000,000, persevering through lawsuits, pit bosses, the gambling commission, and malfunctioning electronic equipment (they used some computerized counting gear). Business tales are often similar. For example, Joe Kraus's story about how Excite got the deal to be featured in Netscape Navigator is a story about persistence in the face of rejection.

Is there a way to reconcile the techie mind-set with a salesperson's world?

business | Permanent Link

The state is that great fiction by which everyone tries to live at the expense of everyone else. - Frederic Bastiat