Wednesday, September 28, 2005
Tuesday, September 27, 2005
Check out two cool features/applications of google.
1. GoogleEarth - http://earth.google.com/
2. Personalised Google Home- http://www.google.com/ig
I tried out google personalised home today and was very impressed. Some cool web parts you can have are,
- Gmail
- News Sites
- Search History
- Quote of the Day
And you can also search for many more and add them and drag and drop them to change the UI.
Here's a snapshot of my current home.
Friday, September 23, 2005
Postgres Performance Tuning
Since postgres performance caught up my attention during the FOSSL conference I decided to do some research on improving postgres performance and here’s what I found.
There are many things one can do to upgrade the performance of a postgres DB. As usual many of these are not straight forward tips that when completed will give u a highly performing DB. But most of these tips have to be applied after careful study on your DB and understanding the compromises.
Some useful yet simple things one could do to upgrade performance are,
1. Creating Indexes
2. EXPLAIN
3. VACUUM / ANALYZE
4. Configuration Parameters
Creating Indexes
Any group of fields which is searched or sorted more frequently than being updated or inserted can have an index.
CREATE INDEX indexname ON table (fields);
EXPLAIN
Postgres devices a query plan for each query. If the plan is matching the properties of actual data then the performance is high. EXPLAIN command lets you check the plan for any query. With EXPLAIN you can create proper indexes or change queries for optimum performance.
Eg:vault_db=# explain select * from rcp_user_device;
QUERY PLAN -------------------------------------------------------------------
Seq Scan on rcp_user_device (cost=0.00..4.77 rows=177 width=172)(1 row)
The results explain provides are,
1. cost – Estimated cost of the query in resource units
2. rows – Estimated number of rows returned
3. width – Number of characters per row
explain analyze will give you a chance to compare the actual effort of a query with the estimated effort. vault_db=# explain analyze select * from rcp_user where userid='user';
QUERY PLAN -------------------------------------------------------
Index Scan using rcp_user_pkey on rcp_user (cost=0.00..5.99 rows=2 width=110)
(actual time=110.118..110.127 rows=1 loops=1) Index Cond: (userid = 'user'::text) Total runtime: 110.260 ms
VACUUM/ANALYZE
This is probably the easisest method to get a performance boost to your DB. As data is added to the DB files become dirty and vacuum. It performs two major operations,
Remove any leftover data from rollbacks and other processes that can leave temporary data
Analyze activity in the database to assist PostgreSQL in designing efficient query plans
There are three levels of VACUUM which can be run:
1. VACUUM - This is the fastest puts the least load on the server. The primary objective of this level is to market unused spaces on the disk file as being available for reuse.
2. VACUUM ANALYZE - In addition to what is done during a standard VACUUM, a VACUUM ANALYZE updates query planner statistics
3. VACUUM FULL - This takes a standard VACUUM around to the next logical step. VACUUM by itself only marks unused records on disk as reusable. A Postgres VACUUM FULL actually eliminates the unused records from the files on disk shrinking the file. Unlike the first two versions of this command, a VACUUM FULL will lock the tables while it is run providing an interruption to anyone trying to use the tables. On large tables where a lot of records have changed, this could take a long time creating a long period of essentally database downtime.
VACUUM ANALYZE can be combined with VACUUM FULL as VACUUM FULL ANALYZE to perform all three functions at once.
The bestpractice adviced by postgres admins is to run VACUUM and VACUUM ANALYZE automatically on a regular basis. May be on a nightly basis when the db is not very active.
Eg :vacuum analyze verbose table_name
Let’s discuss on Configuration Parameter related tuning on another blog.
Saturday, September 10, 2005
1. MySQL - David Axmark, Co Founder of MySQL
2. Postgres - Bruce Momjian, Lead contributer to PostgreSQL
The postgres session was my favourite out of the two and it prompted me to put up this entry so I could prompt a few more people to read a bit more on postgres internals and in turn DBMS fundemantals.
Bruce with his catchy presentation style and very humble explanations was soon very much appreciated by all of us and explained how postgres internally handles SQL queries (http://candle.pha.pa.us/main/writings/pgsql/internalpics.pdf).
This was actually much more generic and I suppose one could get a general understanding of how a DBMS handles SQL queries fires against it.
The basic steps of any decent DBMS of picking up a SQL query and executing is,
1. Parsing and Tokenizing (Postgres uses flex & bison..surprise J ) the query
2. Forming a easy to work with data structure from the tokens
3. Optimizing (This was the bulk of the discussion and was very interesting)
4. Execution
Out of the many small discussion threads we had I’d like to mention one here which was about preparing sql statements. This basically means that you inform to the DB engine about the plan of execution (which is normally derived after doing an optimization step) with the query it self. So the whole overhead of optimizing is vanished since the plan of the query is saved in the DB for you. This can look like a very clever thing to do but could prove to be your downfall pretty easily as shown by Bruce in the session.
The basic idea is that if your query is gonna execute 100 and 1000 times it’s wise to prepare it. Also it should be a pretty complex query where the actual time taken for an optimization (in case of no prepared plan) is comparable with the execution time. Even here if your same query which runs 1000 times a day runs with different sets of parameter values then you could get screwed again.
Eg : SELECT name from T1 where age=?
Assuming that the age column actually contains the age of human employees of your company ;-), there could be lots of values between say…25-40 (Assume 95%). So if you prepared the plan with a value in this range your optimizer will most probably go for a sequential scan. Now suppose your CEO comes in and issues the same SQL with the age 100 to find out who he should fire next. And since the prepared plan is tuned for the 95% of values this instance of the query will perform badly and the CEO decides to fire the poor old DB folkJ.
(A good thread I came across –
http://www.mail-archive.com/pgsql-general@postgresql.org/msg65562.html )
All in old what I understood is that the most difficult part of this query processing is actually the optimizing part. Bruce is actually pretty confident that postgres has the best optimizing engine even compared to commercial products (SECRETS: mySQL optimizer is a single C file). He also pointed out one reason why this is so. In Oracle you have “Oracle Hints” which actually provides users guidance on tuning the DB to a specific kind of query. This is like loosing a very good opportunity for the DB vendor to improve his DB engine. On the other hand Postgres does not have this capability and thus the user is more or less pushed towards informing postgres on a low performing query. This give the postgres community to find yet another weapon in the never ending optimization war. This actually sounds very sensible given the turn over time of patched up version of open source postgres (Oracle can’t do that). Also the fact that a user is not actually desperate for a answer within minutes of finding out that his query is slow. She could actually wait for a couple of days for an answer while her system continues to operate correctly although slowlyJ.
(Check out Bruce Momjian web side at http://candle.pha.pa.us/ for more cool stuff. His writings section has a lot more resources on postgres)
We went to the “Uda-Walawe” (http://www.answers.com/topic/uda-walawe-national-park) wild reserve which is one of the largest National Parks in Sri Lanka located at Central-South area of the country. This is also one of the largest habitats for Elephants in Sri Lanka. We also visited a very old palace (built in 1700) and ‘Hot-Water’ wells which was fun.
I will post some more blogs on this trip in the next few weeks.
Here are couple of pics from the trip.

