
Welcome to Trivore VMware ESX Server v1.5.x hints 'n' tips
@ http://trivore.com/vmware/

This information was last updated on 15-Jun-2003.

This page is designed to aid in designing, setting up, and operating the VMware ESX Server platform. Most of this information has been gathered while working with the product in different production environments and during the several training sessions delivered on the product. This information is current as of version 1.5.2 of VMware ESX Server.

As this information has been gathered mostly for Trivore's VMware ESX Server class students, the presentation might not always be the best possible. Also, some of the information is purely GNU/Linux oriented. This is intentional, as many technical people are still fairly new to the GNU/Linux environment running on the Console OS.

These hints and tips replace neither the excellent VMware ESX Server manual, nor decent hands-on GNU/Linux knowledge, nor decent, hands-on-oriented VMware ESX Server classroom education. Any Unix knowledge makes your life a lot easier.

This document is now frozen for the v1.5.x series. For later information, please look for the documents for later ESX Server versions. If there is something you would like to see here, or you would otherwise like to send feedback, please email us. The maintainer of this information is Kari Mattsson of Trivore Corp.

Planning and documenting the system

This chapter focuses on the planning phase of system setup. It emphasises the importance of documentation and gives the reasons for it. More planning-oriented material is in the Storage chapter.

The files making up a VM (virtual machine)

Generally about 4 files make up a VM: "xxx.dsk", "xxx.cfg", "xxx*.log", and "nvram". There can be more than one .dsk file associated with a VM (virtual machine). All files except the .dsk file(s) are always in the same (sub)directory.

The older .log files are always deleteable at will. The latest .log file can be deleted when the VM is powered off.
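
The rule above can be sketched as a small shell function for the Console OS. This is only a minimal sketch: it assumes that the newest .log file by modification time is the one in use, and the function name old_vm_logs and the /data/vm001 path are made up for illustration. It only lists the candidates and deletes nothing.

```shell
#!/bin/bash
# List every .log file in a VM directory except the newest one.
# Assumption: the newest file by modification time is the active log.
old_vm_logs() {
    local dir=$1
    # ls -t sorts newest first; tail -n +2 skips that first (newest) entry.
    ls -1t "${dir}"/*.log 2>/dev/null | tail -n +2
}

# Example (hypothetical directory): review the list before deleting anything.
# old_vm_logs /data/vm001
```

Listing first and deleting by hand afterwards is a deliberate choice here, as the latest log must stay in place while the VM is powered on.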

The nvram file might occasionally become corrupted for some reason. If that happens, power off the VM and just delete the file. It will be recreated the next time the VM powers up. If any changes were made to the nvram via the VM's BIOS setup, do not forget to remake those changes.

In addition to the .dsk file, a .dsk.REDO file will exist in the same directory if non-persistent, undoable, or append disk mode is active for that virtual SCSI disk. The .dsk.REDO file can grow to several gigabytes. How large it grows depends only on what file/disk operations are done on the disk.
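
Because REDO files can grow large, it may help to check on them now and then. The following is a minimal sketch, assuming the vmfs partitions are mounted under /vmfs and that find and du behave there as they do on ordinary filesystems. The function name list_redo_files is made up for illustration.

```shell
#!/bin/bash
# Print the size in kilobytes and the path of every .REDO file below a directory.
list_redo_files() {
    local top=${1:-/vmfs}
    # -type f: regular files only; du -k reports each size in kilobytes.
    find "${top}" -type f -name '*.REDO' -exec du -k {} \;
}

# Example: list_redo_files /vmfs
```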

It is a good idea to place a brief descriptive "xxx.txt" file next to the "xxx.cfg" file to document what this VM is, what its business purpose is, what the OS version and level are, what the main installed applications are, and who is responsible for the VM. If you absolutely do not want the .txt file, you could place the same information in the .cfg file's comment lines. Whether the information is safe there is untested.

Naming a VM and its files is important. It will be crucial once there are tens (dozens for those of you still using legacy systems :-) of VMs. It is a recommended practice to:

A powered-on VM can be suspended, much like a laptop. During suspending, a suspend file is created. It has the filename extension .vmss, and by default it is created in the .cfg file directory. It might be a better idea to direct it to the same directory where the VM's operating system .dsk file is.

The .vmss file will be a few megabytes larger than the maximum RAM allocated to the VM.
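
For planning purposes, the suspend file size can be estimated with simple arithmetic. The sketch below assumes an overhead of 5 MB as a stand-in for "a few megabytes"; that figure is a guess, not a documented value, and both function names are made up for illustration.

```shell
#!/bin/bash
# Rough suspend-file size for one VM, in megabytes.
# overhead_mb is an assumed stand-in for "a few megabytes"; adjust as needed.
vmss_estimate_mb() {
    local ram_mb=$1
    local overhead_mb=${2:-5}
    echo $(( ram_mb + overhead_mb ))
}

# Total for several VMs: pass each VM's maximum RAM in megabytes.
vmss_total_mb() {
    local total=0 ram
    for ram in "$@"; do
        total=$(( total + $(vmss_estimate_mb "${ram}") ))
    done
    echo "${total}"
}

# Example: vmss_total_mb 256 512 1024
```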

Planning and documenting

Planning is important with ESX. It is especially important with the storage allocated to VMs. Some preliminary questions to answer are: What are the VMs to be installed on ESX Server? What operating systems will be installed? Are they file/print/database/application servers, or what? What kind of storage allocation will be needed? How much storage is initially reserved for each VM? Are VMs executed on more than one ESX Server at the same time? What are the CPU requirements per VM? What are the network I/O requirements per VM? What are the disk I/O requirements per VM? What are the memory requirements per VM? How much storage is required for suspended VMs? Will non-persistent, undoable or append disk mode be used?

Documenting is another extremely important issue with ESX Server, as these systems easily become extremely complex. A mainframe and Unix style management attitude is required. Document everything carefully. It will often save your day. The few-line .txt file next to the .cfg file is especially useful. An example skeleton .txt file follows:

Name of the VM (virtual machine):  [vm###]
Last update of this file........:  [dd-Mmm-yyyy / updater's name]
Business purpose of the VM......:  [short description]
Responsible person(s) for the VM:  [name, contact (email and/or telephone)]
OS name and version in the VM...:  [OS name, update level]
Installed core application(s)...:  [app1, app2]
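
The skeleton above could be generated for each new VM with a small shell function. This is a sketch only: the function name make_vm_doc is made up, and it assumes the description file is named after the VM and lives in the same directory as the .cfg file.

```shell
#!/bin/bash
# Write the skeleton description file next to a VM's .cfg file.
# Usage: make_vm_doc /path/to/vm-directory vmname
make_vm_doc() {
    local dir=$1 vm=$2
    local out="${dir}/${vm}.txt"
    # Never overwrite a description somebody has already filled in.
    if [ -e "${out}" ]; then
        echo "${out} already exists" >&2
        return 1
    fi
    cat > "${out}" <<EOF
Name of the VM (virtual machine):  ${vm}
Last update of this file........:  $(date +%d-%b-%Y) / ${USER:-unknown}
Business purpose of the VM......:  [short description]
Responsible person(s) for the VM:  [name, contact (email and/or telephone)]
OS name and version in the VM...:  [OS name, update level]
Installed core application(s)...:  [app1, app2]
EOF
}
```

The remaining bracketed fields are left for the administrator to fill in by hand.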

Storage: partitioning, filesystems, directories, files, etc.

This chapter covers the issues on storage. Some of this material is very planning oriented.

General information

The maximum number of partitions which can contain data per SCSI disk (= logical drive on RAID systems) in Linux kernel 2.4.x based systems is 14. The Console OS is based on kernel version 2.4.9. Of the maximum of 15 partitions, 3 are primary (1-3), 1 is extended (4), and 11 are logical (5-15). The extended partition never contains any actual data. It merely acts as a container for the logical partitions.

This 14 data partition (partitions 1-3, and 5-15) limit could force you to create smaller logical drives than you wanted.

It is a nice practice to use primary partitions for non-vmfs filesystems and logical partitions for vmfs filesystems. Nothing forces you to do that; it is just a clear choice.
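
The effect of the partition limit on planning can be estimated with a little arithmetic. The sketch below assumes the practice just described, where vmfs goes only on the 11 logical partitions (5-15) of each logical drive. The function name drives_needed is made up for illustration.

```shell
#!/bin/bash
# How many logical drives are needed for a number of vmfs partitions,
# assuming vmfs goes only on logical partitions 5-15 (11 per drive).
drives_needed() {
    local vmfs_partitions=$1
    local per_drive=${2:-11}
    # Integer ceiling division.
    echo $(( (vmfs_partitions + per_drive - 1) / per_drive ))
}

# Example: 30 vmfs partitions, one per VM
# drives_needed 30
```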

SCSI disks are named as /dev/sda (first), /dev/sdb (second), /dev/sdc (third). The first disk is always the boot disk and internal to the server. The other disks are usually on an external disk enclosure.

If you are using CD/DVD ISO images, or floppy images, create a partition for them on the Console OS. Images are very handy for OS and application installations. They are much more reliable and convenient than the actual CDs. Example commands on how to create these images are given later in this document.
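
As a sketch of how such images can be made, the function below wraps the dd command described in the commands chapter. The function name make_image and the diskette image file name are made up for illustration; the block sizes are the ones this document suggests.

```shell
#!/bin/bash
# Copy a device (or file) block by block into an image file with dd.
# Usage: make_image source destination block-size
make_image() {
    local src=$1 dest=$2 bs=$3
    dd if="${src}" of="${dest}" bs="${bs}"
}

# CD/DVD in the first drive:
# make_image /dev/cdrom /vmfs/local/suse82pro-dvd.iso 20480
# Diskette in the first floppy drive (file name is hypothetical):
# make_image /dev/fd0 /vmfs/local/bootdisk.flp 512
```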

On IBM servers, the 50 MB System Partition accessible with [Alt-F1] is partition /dev/sda4. VMware ESX Server recognises this partition as "Old VMware Core Dump Partition". Usually you do not want to touch this partition, so leave it alone. Non-IBM systems might also have this kind of service partition.

VMFS (vmfs, virtual machine file system)

Partitions formatted as vmfs do not support subdirectories. It is one thing given up for the fast speed of vmfs.

vmfs partitions have 3 possible accessibility settings:

.dsk files (seen as physical SCSI disks in the VM)

The virtual SCSI disks (.dsk files) have four different modes:

The last two modes are invaluable when upgrading or testing the software in a VM.

Strategy for placing all the files - VMware ESX Server files and the VM files

It is wise to install VMware ESX Server to the local/internal disks (single RAID-1 SCSI logical drive).

Place all the VM-related files (guest OSs) to external disks (logical drives) on fibre channel (or similar).

Standard: Partitioning the VMware ESX Server local/internal disk (/dev/sda)

This is where the VMware ESX Server and its Console OS will reside. Partitioning example for system with 36 GB local/internal drive:

/dev/sda4   primary   50       IBM xSeries System partition - might not exist
/dev/sda1   primary   2000     /            ext2 or ext3  Console OS root.
/dev/sda2   primary   1000     swap                       Console OS swap. See note below.
/dev/sda3   extended  ~31000                              Logical partition container.
/dev/sda5   logical   50       vmcore                     VMware Core Dump.
/dev/sda6   logical   30000    /vmfs/local  vmfs          See notes below.

Tuned: Partitioning the VMware ESX Server local/internal disk (/dev/sda)

This is really for experienced GNU/Linux administrators only. Use this partitioning scheme instead of the above one.

This fine-tuning makes it possible to make the Console OS even more secure. You do not want to apply this plan unless you are very familiar with Linux and the Console OS.

/dev/sda4   primary   50       IBM xSeries System partition - might not exist
/dev/sda1   primary   100      /boot        ext2  Console OS kernel is here.
/dev/sda2   extended  ~34000                      Logical partition container.
/dev/sda5   logical   500      /            ext2  Console OS root.
/dev/sda6   logical   500      /tmp         ext2  This is really not much used.
/dev/sda7   logical   1000     /var         ext2  All the local log files will be here.
/dev/sda8   logical   1500     /usr         ext2  All standard programs are here.
/dev/sda9   logical   100      /usr/local   ext2  All server local scripts/programs.
/dev/sda10  logical   300      /opt         ext2  Optional 3rd party programs.
/dev/sda11  logical   1000     swap               Console OS swap. See note below.
/dev/sda12  logical   50       vmcore             VMware Core Dump.
/dev/sda13  logical   25000    /vmfs/local  vmfs  See notes below.

Partitioning and otherwise handling the external disk space for VMs

There are two grand philosophies (schemes) with partitioning the external disk space where VMs reside. You can combine the schemes, if required. Both schemes are presented below:

Partitioning scheme A

This partitioning scheme needs more preliminary planning, but is more convenient for the administrators operating the server. It is also easier, if you have more than one ESX Server executing VMs on one external logical drive.

Partitioning example for a system with a 200 GB fibre channel logical drive; all formatted filesystems are of type vmfs, and accessibility is set to either private (only a single VMware ESX Server exists in the SAN) or public (another VMware ESX Server also exists in the SAN):

/dev/sdb1   extended  ~200000   volume name   example name: vmhba2:1:0:1
/dev/sdb5   logical     10000   vm001         vm001-os.dsk, 6000
/dev/sdb6   logical     30000   vm002         vm002-os.dsk, 4000 & vm002-data.dsk, 20000
/dev/sdb7   logical     10000   vm003         vm003-os.dsk, 7000
/dev/sdb8   logical     95000   vm004         vm004root.dsk, 10000 & vm004home.dsk, 75000

Partitioning scheme B

This partitioning scheme allows easier growing of .dsk files and is generally more efficient in disk space usage. It is also a bit riskier: all the .dsk files are on one vmfs partition, so if something happens to it, for example filesystem corruption, every .dsk file there is affected.

Partitioning example for a system with a 200 GB fibre channel logical drive; all but the first formatted filesystem are of type vmfs, and accessibility is set to either private (only a single VMware ESX Server exists in the SAN) or public (another VMware ESX Server also exists in the SAN):

/dev/sdb1   primary       200   ext2 fs       mounted as /data for VM config and log files
/dev/sdb2   extended  ~200000   volume name   example name: vmhba2:1:0:2
/dev/sdb5   logical   ~160000   private1      all ESX Server private .dsk files
/dev/sdb6   logical     39000   shared1       Optional: all VM cluster shared .dsk files

Multiple ESX Servers connected to the same fibre channel infrastructure

On multiple VMware ESX Server configurations with shared fibre channel storage, the following (seemingly a little conflicting) issues and restrictions apply:

Network: setup and configuration-related issues

Please consider the following issues:

VMware ESX Server tools and documentation

Most, if not all, VMware Console OS utilities have a good man page. Enter "man xxx" to get the manual page for command "xxx". Example: man vmkfstools.

VMware ESX Server vmfs partition mount utilities are "mount-vmfs" and "umount-vmfs". The only parameter needed for these commands is the vmhba-name. By default, private vmfs partitions are mounted under the /vmfs/ directory. You have to mount public and shared partitions manually, or you can mount them automatically at every boot by adding the correct mount-vmfs lines to the end of the /etc/rc.d/rc.local file on the Console OS.
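
Adding such a line to rc.local could be done by hand in an editor, or with a small helper like the one sketched below. The function name add_vmfs_mount and the vmhba name in the example are made up for illustration, and the sketch assumes that mount-vmfs is found on the PATH at boot time. The helper adds the line only if it is not already present.

```shell
#!/bin/bash
# Append a mount-vmfs line to an rc.local style file unless it is already there.
# Usage: add_vmfs_mount vmhba-name [file]
add_vmfs_mount() {
    local vmhba=$1
    local rc=${2:-/etc/rc.d/rc.local}
    local line="mount-vmfs ${vmhba}"
    # -x matches the whole line, -F treats the text literally.
    if ! grep -qxF "${line}" "${rc}" 2>/dev/null; then
        echo "${line}" >> "${rc}"
    fi
}

# Example (hypothetical vmhba name): add_vmfs_mount vmhba2:1:0:5
```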

The general and very comprehensive vmfs filesystem utility is "vmkfstools".

The ESX utility for locating the correct netcard among many netcards is "findnic".

All the tools are properly documented in the VMware ESX Server manual.

VMware ESX Server-related Linux files and directories

There are a couple of files and directories you should know about. The most important ones are listed below.

/etc/modules.conf
This file contains a list of devices in the system available to the Console OS. Usually the devices allocated solely to VMs, but physically existing on the system are also shown here in the commented-out ("#") lines. This is an important file for root and administrators.
/etc/fstab
This file defines the local and remote filesystems which are mounted at ESX Server boot.
/etc/rc.d/rc.local
This file is for server local customisations required at the server bootup. Potential additions to this file are public/shared vmfs mounts.
/etc/syslog.conf
This file configures what things are logged and where. Some examples are given below:
*.alert     /dev/tty12
This example logs all log items at level "alert" or higher to the virtual terminal at tty12. You can see this log by pressing [Alt]-[F12] on the console.
*.*     @192.168.31.3
This example forwards everything (all syslog entries) to another (central) syslog server.
/etc/logrotate.conf
This is the main configuration file for log file rotation (logrotate). It defines the defaults for log file rotation, log file compression, and time to keep the old log files. Processing the contents of /etc/logrotate.d/ directory is also defined here.
/etc/logrotate.d/
This directory contains instructions service by service for log file rotation, log file compression, and time to keep the old log files.
/etc/inittab
Here you can change the number of virtual terminals available on the Console OS. Default is 6, but you can go up to 9.
/etc/bashrc
The system default $PS1 is defined here. It is a good idea to change "\W" to "\w" here to always see the full path while logged on to the Console OS.
/etc/profile.d/colorls.sh
Command "ls" is aliased to "ls --color=tty" here. Many admins don't like this colouring. You can comment out ("#") this line.
/etc/init.d/
This directory contains the actual start-up scripts.
/etc/rc3.d/
This directory contains the K(ill) and S(tart) scripts for the default runlevel 3. The services starting with "S" are started on this runlevel, and the services starting with "K" are killed, i.e. not started.
/var/log/
This directory contains all the log files. VMware's log files start with letters "vm". The general main log file is "messages".
/etc/ssh/
This directory contains all the SSH daemon configuration files, public and private keys. The defaults are both secure and flexible and rarely need any changing.
/etc/xinetd.conf
This is the main configuration file, holding the default settings, for the xinetd daemon. Processing the contents of /etc/xinetd.d/ directory is also defined here.
/etc/xinetd.d/
This directory contains service-by-service instructions on whether and how to start each service. Of the services here, vmware-authd, wu-ftpd, and telnet are the most interesting to us. Two of the most interesting parameter lines are "bind =" and "only_from =", which allow limiting service usage.
/etc/ntp.conf
This file configures the NTP daemon. Usable public NTP servers in Finland are ntp1.funet.fi, and ntp2.funet.fi.
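
Whenever one of the files above is edited, it is wise to keep a copy of the original. As a sketch, the prompt change suggested for /etc/bashrc could be done as follows. The function name show_full_path_in_prompt and the ".orig" suffix are made up for illustration, and the sketch simply replaces every "\W" in the file.

```shell
#!/bin/bash
# Replace \W with \w in a bashrc style file, keeping a backup copy.
# Usage: show_full_path_in_prompt [file]
show_full_path_in_prompt() {
    local file=${1:-/etc/bashrc}
    # Keep the original, with its timestamps and modes, as file.orig.
    cp -p "${file}" "${file}.orig" || return 1
    # Read the backup and rewrite the file itself; this avoids relying on
    # an in-place option, which older sed versions do not have.
    sed 's/\\W/\\w/g' "${file}.orig" > "${file}"
}
```

Log out and back in to see the new prompt. If something goes wrong, copy the ".orig" file back.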

VMware ESX Server-related Linux commands

There are a couple of commands you should familiarise yourself with. Most of them and some more are listed here. All of them have an online manual page, which you can read with the command "man command-name".

man
Prints the manual page for a command or a configuration file entered as a parameter to this command.
reboot
Does a nice reboot on the system. Does "Force Power Off" for the VMs.
halt
Does a nice halt on the system. Does "Force Power Off" for the VMs.
shutdown
Generic command for shutting down or rebooting the system.
fdisk
Command line disk partitioning program in Linux. It is powerful and has a very simple user interface.
fdisk /dev/sdb
On command line, starts fdisk against second available SCSI disk. "sda" is the first SCSI disk, "sdc" is the third SCSI disk etc. VMware ESX Server is installed on /dev/sda, and the external storage is /dev/sdb, and maybe some others too.
p
Fdisk subcommand, prints the current partition table on current disk.
d
Fdisk subcommand, deletes a partition. Enter the number of the partition to delete. It is recommended to print out the current partition table before deleting anything.
n
Fdisk subcommand, creates a new partition. Select partition type (primary, extended, or logical). Almost always you should use the default starting cylinder. For size, enter "+NNNNNm", where NNNNN is the size in megabytes.
t
Fdisk subcommand, change partition type (id). By default fdisk creates ext2 type partitions. We might also want to use id "fb", the vmfs type, or some other type.
w
Fdisk subcommand, writes the current partition table to disk. If you don't get any errors, you don't have to reboot. If you get errors at this point, the new partition table is used only after next system boot.
mke2fs
This command formats a partition for ext2 filesystem. Example command would be "mke2fs /dev/sdb1".
mount|umount
These commands manually mount/umount CDs, floppies, local partitions, and remote directories to a selected local directory. The local (empty) directory must exist before the mount can succeed. Example mount command would be "mount /dev/sdb5 /data". Permanent mounting is done by editing the /etc/fstab file.
mkdir
Makes a directory.
rm
Removes files and/or directories.
mv
Moves files and/or directories.
kudzu
This is the RedHat's tool to detect and configure hardware: adding new and removing old. When you run kudzu, or system runs it at bootup, be careful. Kudzu might offer to remove hardware you have dedicated solely to the VMs. Know your hardware and configuration. It might be a good idea to refer to /etc/modules.conf file before running kudzu. A safe action in kudzu is "Do nothing". Select it when in doubt.
service
RedHat-made tool for daemon (service) starting/stopping/restarting/status querying. Syntax is "service servname [start|stop|restart|status]". An alternative to this command, which works with all Linuces, is to call the script directly, like "/etc/init.d/sshd restart".
groupadd
Adds a new group to the Console OS. It is recommended to use one non-root group for VM admins and add operator/admin users there. To create that group, enter one of the following commands:
groupadd -g 7777 vmadmins
groupadd -g 7777 vmadms01
useradd
Adds a new user to the Console OS with status disabled. To create new admins, enter one of the following commands:
useradd -g 7777 johndoe
useradd -g 7777 -c "Kari Mattsson" mattkar2
passwd
Changes the password for the userid entered as a parameter for the command. Only root can change the password of other users. Other users can only change their own password with the command "passwd". Userids are disabled by default. They are activated by setting a password for them. An example command for root to set a password is the following command:
passwd johndoe
chown
Changes the owner user and optionally owner group of a directory, or a file. Optionally this command works recursively with parameter "-R". The assignment parameter is of type "user.group", or just "user". Some examples are given below:
chown -R root.operator /vmfs/* /data/*
chown root.esxadmin /vmfs/local/*
chown -R root /data/vmware
chown root.operator /etc/modules.conf
chgrp
Changes the owner group of a directory, or a file. Optionally this command works recursively with parameter "-R". Examples for "chown" apply here, but without the "root." part, as only the group is changed here.
chattr
Changes special attributes of a directory or a file. The immutable attribute is set with parameter "+i" and removed with parameter "-i".
chmod
This command is the main command for changing file modes. Like chown, it can do things recursively with parameter "-R". Below are some example commands:
chmod -R 0775 /vmfs/* /data/*
chmod u=rwx,g=rwx,o=r /vmfs/freebsd462/*
chmod g+rwx /vmfs/vm007/*
chmod -R u+rwx,g=r,o-rwx /var/log/*
chmod u=rw,g=rw,o=r /etc/modules.conf
chmod 664 /etc/modules.conf
dd
With this command you can create ISO images and floppy images. Example command to create an ISO CD/DVD image is "dd if=/dev/cdrom of=/vmfs/local/suse82pro-dvd.iso bs=20480". For diskettes, use "if=/dev/fd0", and "bs=512".
cat
ConCATenate file from start to standard output (terminal screen by default). Usually takes filename as a parameter.
ls
LiSt files in a directory. -R makes it recursive, and -l shows more information on each item.
stat
Show statistics of a file. This is the most comprehensive directory entry examiner.
tac
Like "cat", but starts from the end of the file (or standard input).
head
Show a selected number of lines from the start of a file.
tail
Like "head", but start from the end of the file. Practical command to follow what is happening with a log file is command like "tail -f /var/log/messages".
grep
Search for a string from standard input or from a file. This is a powerful command.
find
Find files by name or many of the other attributes. Another very powerful command. Below are some example commands:
find / -type f -name '*.bak'
find . -type d -name sbin
find / -type f -name '*'
tar
Tape ARchive, a command which combines many files into one for backup purposes. Below are some example commands:
tar -cvzf /vmfs/local/esx.tar.gz --exclude /proc --exclude /vmfs /
tar -cf /vmfs/local/vm-configs.tar /data
tar -xvzf /vmfs/local/vm007-config.tar.gz
gzip|gunzip
These commands compress and decompress files. The recommended and default extension is .gz.
more|less
These commands usually act in a pipe. They are used for paginating file output on the terminal.
ntpdate
This command takes an NTP server as a parameter and synchronises the clock once. This command doesn't work when local NTP daemon is running. Example: ntpdate ntp1.funet.fi
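
Several of the commands above can be combined into small helpers. As a minimal sketch, the function below packs a directory of VM configuration files into a dated archive with tar. The function name backup_vm_configs and the archive naming are made up for illustration; /data is the configuration directory of partitioning scheme B.

```shell
#!/bin/bash
# Pack a directory of VM configuration files into a dated, compressed archive.
# Usage: backup_vm_configs source-directory destination-directory
backup_vm_configs() {
    local src=$1 destdir=$2
    local archive="${destdir}/vm-configs-$(date +%Y%m%d).tar.gz"
    # -C changes into the parent first, so the archive holds relative paths.
    tar -czf "${archive}" -C "$(dirname "${src}")" "$(basename "${src}")" || return 1
    echo "${archive}"
}

# Example:
# backup_vm_configs /data /vmfs/local
# List the contents of an archive before relying on it:
# tar -tzf /vmfs/local/vm-configs-20030615.tar.gz
```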

Network: Console OS (Linux) security in production use

This discussion presumes the default high security level is set during the initial VMware ESX Server configuration. In this security level only the following ports are open on the Console OS:

22
SSH daemon listens to this port for remote connections. By default password authentication is used for logons. RSA/DSA public/private key authentication can be used and it is actually tried first. Userid/password authentication is actually tried second. For higher security and for automated/scripted logons RSA/DSA authentication must be used.
902
VMware authd, the web management UI and remote console authentication daemon (service) for VMware ESX Server, uses this port. The daemon does not listen on this port directly; xinetd does. When someone opens a connection to port 902, xinetd launches authd, and the actual authentication starts. Xinetd-related authd security is defined in the file /etc/xinetd.d/vmware-authd.
80 and 443
The httpd.vmware application web server listens to these ports. With high security on, all connections to port 80 are automatically redirected to port 443.
8222 and 8333
These ports are used by ESX Server's web UI. They are just forwards to ports 80 and 443 respectively.

Remember that sshd is by default always running on the Console OS, so you can always connect to it and do low-level management directly on the Console OS files.
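
One way to verify the list of open ports on a running system is to compare the listening TCP ports against it. The sketch below assumes that netstat is available on the Console OS and prints the usual "netstat -tln" column layout. The function name unexpected_ports is made up for illustration.

```shell
#!/bin/bash
# Read "netstat -tln" style output on standard input and print every
# listening TCP port that is not on the expected list.
unexpected_ports() {
    local expected=" 22 80 443 902 8222 8333 "
    # Field 4 is the local address; the port is whatever follows the last colon.
    awk '$1 ~ /^tcp/ && $NF == "LISTEN" { n = split($4, a, ":"); print a[n] }' |
    sort -un |
    while read -r port; do
        case "${expected}" in
            *" ${port} "*) ;;
            *) echo "${port}" ;;
        esac
    done
}

# On the Console OS: netstat -tln | unexpected_ports
```

No output means that only the ports listed above are listening.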

Backup VMs to other host with sshd on the Console OS

Please note that from the "other host", where you have backed up the VMs, the VMs can be backed up further to tape, etc.

This discussion presumes the following:

Now the following shell script can be used on the "other host" to fetch a VM passed as a parameter to the script:
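#!/bin/bash
# Fetch files related to VM $1 from VMware ESX Server using userid 'backup'.
vmname=$1
# Refuse to run without a VM name; otherwise scp would target /vmfs//*.
[ -n "${vmname}" ] || { echo "usage: $0 vmname" >&2; exit 1; }
# Create the per-VM backup directory (no error if it already exists).
mkdir -p "/backup/esx01.trivore.com/${vmname}"
# Stop if the directory is not usable, so files never land in the wrong place.
cd "/backup/esx01.trivore.com/${vmname}" || exit 1
# Copy all files of the VM, preserving modification times and modes.
scp -p "backup@esx01.trivore.com:/vmfs/${vmname}/*" .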

Name the above script "/usr/local/bin/esx-vm-backup", and execute it with parameter "vm001", or any other VM name. Example command: /usr/local/bin/esx-vm-backup vm001
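
To fetch several VMs in one go, the script can be called in a loop. The following is a sketch only: the function name backup_all_vms and the BACKUP_CMD variable are made up for illustration, and it assumes the backup script is installed under the path given above and returns a non-zero exit status on failure.

```shell
#!/bin/bash
# Run the backup script once per VM name and report failures at the end.
# BACKUP_CMD can be overridden, for example with "echo" for a dry run.
BACKUP_CMD=${BACKUP_CMD:-/usr/local/bin/esx-vm-backup}

backup_all_vms() {
    local vm failed=0
    for vm in "$@"; do
        if ! "${BACKUP_CMD}" "${vm}"; then
            echo "backup of ${vm} failed" >&2
            failed=$(( failed + 1 ))
        fi
    done
    # Non-zero return when at least one VM could not be fetched.
    [ "${failed}" -eq 0 ]
}

# Example: backup_all_vms vm001 vm002 vm003
```

The loop continues after a failure, so one unreachable VM does not prevent the others from being fetched.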