Unix Problem: zip Utility Path and Unix Environment Variables

By Xah Lee.

This page describes a problem with unix's zip utility that is related to the problem of unix's environment variables.

Unix has a command line zip utility to compress files and folders. For example, suppose you want to archive this folder: c:/Users/x/xyz/. All you have to do is “cd” to the parent folder c:/Users/x/, then type zip -r xyz.zip xyz. An archive file named xyz.zip will be created.

Problem

But suppose you need to call “zip” from a program. You know the dir you want to archive, and you know the dir where the archive file should go.

Suppose in your program, you have:

zip -r "c:/Users/x/output/xyz.zip" "c:/Users/x/xyz"

This will create the archive.

However, there's a problem. The archive records the full path of each file. So, when a user tries to unzip xyz.zip on her machine with her current dir at c:/Users/joe/Downloads/, it'll try to create the files and dirs at c:/Users/x/xyz, or create them at c:/Users/joe/Downloads/Users/x/xyz. Worse, if you use relative paths in your program, some unzip software will report an error.

Problem with Unix Environment Variable

There does not seem to be an option in the unix zip command line utility to solve this. The best you can do is, in your script, change the current dir to the parent of the dir you want to archive, then call zip -r xyz.zip xyz, just as if you were doing it manually.
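
A minimal sketch of that workaround, assuming a POSIX shell and Info-ZIP's zip on the PATH. The function name zip_dir is made up for illustration; it is not part of zip. The function body is wrapped in parentheses so it runs in a subshell, which keeps the changed current dir from leaking into the rest of the script.

```shell
#!/bin/sh
# zip_dir SRC_DIR OUT_ZIP  (hypothetical helper, not part of zip)
# Archives SRC_DIR so stored names start at its last path component,
# e.g. "xyz/a.txt" rather than "Users/x/xyz/a.txt".
# The body is wrapped in ( ... ), so it runs in a subshell: the cd and
# the variables below disappear when the function returns.
zip_dir() (
    src=$1
    out=$2
    parent=$(dirname "$src")
    name=$(basename "$src")

    # Make OUT_ZIP absolute before changing directory, otherwise a
    # relative output path would be resolved against the new current dir.
    case $out in
        /*|[A-Za-z]:/*) ;;            # already absolute (unix or c:/ style)
        *) out=$(pwd)/$out ;;
    esac

    cd "$parent" || exit 1            # exits only the subshell
    zip -q -r "$out" "$name"          # the output dir must already exist
)
```

With this, a call such as zip_dir "c:/Users/x/xyz" "c:/Users/x/output/xyz.zip" should store entries as xyz/..., and the caller's current dir stays where it was.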

The problem with this is that once you introduce the “current dir” environment variable into your code, you have to be careful with every line that deals with directories. Env vars are global variables, and a wrong value of “current dir” will affect every function that takes a relative dir as path. This is especially important for build scripts that deal with lots of directories. If you forget to set the current dir before a function call that takes a relative path, you might delete the wrong dirs or files.
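
One way to limit the damage is to never delete through a relative path directly, but to resolve it first and check where it lands. A rough sketch, assuming a POSIX shell. The name safe_rm_tree and the idea of a fixed “root” dir are made up for illustration.

```shell
#!/bin/sh
# safe_rm_tree ROOT TARGET  (hypothetical guard)
# Deletes TARGET only if it resolves to a directory strictly inside ROOT.
# Both paths are resolved to absolute physical paths first, so a wrong
# current dir leads to a refusal instead of a deleted home dir.
safe_rm_tree() (
    root=$(cd "$1" && pwd -P) || exit 1
    target=$(cd "$2" && pwd -P) || exit 1
    case $target in
        "$root"/?*)
            rm -rf -- "$target"
            ;;
        *)
            echo "refusing to delete $target: not inside $root" >&2
            exit 1
            ;;
    esac
)
```

A build script would call it with the build output dir as ROOT, so a stray ../ or a forgotten cd makes the call fail instead of removing something else.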

I got a bug report about our ErgoEmacs software's zip archive file. The default unzip utility in Windows 7 and also “7-Zip v9.17” both claim that my zip archive is empty.

The problem is caused by relative paths stored in my zip archive, which in turn come from using a relative path when calling the zip utility, like this: zip ErgoEmacs_v1.9.zip ../. The unix unzip gives a warning but otherwise works.

So, i tried to fix my build script. I spent 2 hours and realized there is no option in the zip util to do what i want. So, in the end, i set the current dir to the parent dir of the dir i want to zip, fully aware of the danger. Yet the first mistake i made was that the script copied my entire ~/ dir (a few gigabytes; i was wondering why it took so long). Even fully aware of the danger, and having just been stung, i somehow made another mistake. This time, my script deleted my entire svn working dir. What PAIN.
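
A build script could check for this kind of archive before release. The sketch below assumes Info-ZIP's unzip, whose -Z1 option (zipinfo mode) lists the stored names one per line. The function name has_parent_refs is made up for illustration.

```shell
#!/bin/sh
# has_parent_refs ZIPFILE  (hypothetical check)
# Succeeds (exit status 0) if any stored entry name contains
# a ".." path component, such as "../ergoemacs/init.el".
has_parent_refs() {
    unzip -Z1 "$1" | grep -Eq '(^|/)\.\.(/|$)'
}
```

Usage would be something like: if has_parent_refs ergoemacs_1.9.1.zip; then echo "bad archive"; fi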

The essence of this problem is unix's concept of current dir as an environment variable. The environment variable is a global variable. Env var and current dir might be unavoidable and useful concepts for the OS, but the essential problem of unix is that unixers do not realize the full extent of what env vars affect. Thus we have a problem like the zip util today, which has no option to clearly specify the input path, the output path, and the paths stored in the archive.

What is the Big Deal?

Some unixers might say “what's the big deal?”. The problem is that, as a software programmer, you spend hours daily on seemingly trivial problems, caused by millions of these little things.

If you do not introduce an environment variable into your program, it is impossible to create the correct zip archive. If you do introduce the env var, you have basically introduced a pest into your script, and you have to be careful on every line to set the current dir correctly. If you forget (which is normal), it causes disaster. If you do functional programming, or use a functional lang, this is a case of so-called “impedance mismatch”. (For example, some perl or python hotheads insist that their lang can do functional programming, citing esoteric techniques and libraries that supposedly do even advanced functional programming. However, in practice, if you actually try to code in functional style, littered throughout the lang are little problems that fight you.)

For the Record: Version Number and Test Files

For testing purposes, here's the “bad” zip file with relative paths in it: http://ergoemacs.googlecode.com/files/ergoemacs_1.9.1.zip

The version of unzip program is:

unzip --help
UnZip 6.00 of 20 April 2009, by Cygwin. Original by Info-ZIP.

The zip version is:

zip --help
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage:

They are both from Cygwin. I'm running Windows Vista.

Another unix faakup: just now i tried to find the versions of zip and unzip that i'm using, for the record of this article. I could not find any obvious option to print the version number, nor could i find it documented in the man pages. In the end, it turns out the version number is printed when you use the “--help” option. (The “--help” option is not even mentioned in unzip's man page.)

Another problem with unix is that its documentation (man page) always starts like this:
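
For what it's worth, the Info-ZIP man pages do describe “-v”, given as the only argument, as printing a diagnostic screen that starts with the version. A small sketch, assuming Info-ZIP builds of both tools (other zip implementations may behave differently). The function name show_zip_versions is made up for illustration.

```shell
#!/bin/sh
# show_zip_versions  (hypothetical helper)
# Prints the leading lines of the diagnostic screens of zip and unzip,
# which is where Info-ZIP builds report their version.
show_zip_versions() {
    zip -v | head -n 2
    unzip -v | head -n 1
}
```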

SYNOPSIS
       zip  [-aABcdDeEfFghjklLmoqrRSTuvVwXyz!@$] [--longoption ...]  [-b path]
       [-n suffixes] [-t date] [-tt date] [zipfile [file ...]]  [-xi list]

Who the faak understands what gobbledygook it is talking about? [see Idiocy of Computer Language Docs: Unix, Python, Perl, Haskell]

For the record, this is my elisp build script as it currently is.

; -*- coding: utf-8 -*-

;; 2009-10-01, 2010-11-15
;; This elisp script builds an ErgoEmacs elisp package.
;; Effectively, it creates a new zip file, nothing else.

;; This script is experimental. Best to use the make util at
;; ergoemacs/Makefile
;; for now.

;; What does it do:
;; copy the whole “ergoemacs” dir into some dest dir. The “ergoemacs” is the dir from root checked out from svn.
;; remove all .svn dirs.
;; remove other files and dir such as Makefile and win32-setup etc.

;; HOW TO RUN IT
;; First, change the version number in variable “zipDirName”.
;; then, just eval-buffer.
;; The result will be a new zip file (and an unzipped dir) at the root of your svn checkout.
;; For example, if your svn checkout path is
;;   c:/Users/xah/ErgoEmacs_Source
;; then the following are created
;;   c:/Users/xah/ErgoEmacs_Source/ergoemacs_1.9.1
;;   c:/Users/xah/ErgoEmacs_Source/ergoemacs_1.9.1.zip

;; This script requires unix “find”, “rm”, “cp”, etc.

(defvar zipDirName nil "the zip file/dir name")
(setq zipDirName "ergoemacs_1.9.1.1")

(defvar sourceDir nil "The ergoemacs source code dir in repository. By default, this is parent dir of the dir this file is in.")
(setq sourceDir (expand-file-name  (concat (file-name-directory buffer-file-name) "../")) ) ; for example: "c:/Users/xah/ErgoEmacs_Source/ergoemacs/"

(defvar destDirRoot nil "The output dir. Will be created if it doesn't exist. By default, this is 2 dirs above this file.")
(setq destDirRoot (expand-file-name  (concat (file-name-directory buffer-file-name) "../../"))) ;

(setq destDirWithZipPath (concat destDirRoot zipDirName "/"))

;; set to absolute path if not already
(setq sourceDir (expand-file-name sourceDir ) )
(setq destDirRoot (expand-file-name destDirRoot ) )
(setq destDirWithZipPath (expand-file-name destDirWithZipPath ) )

;; main

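;; if previous build dir and zip file exist, remove them.
;; The zip file sits next to the build dir, at destDirRoot/zipDirName.zip,
;; not inside it, so its path is built from destDirRoot.
(when (file-exists-p destDirWithZipPath)
  (delete-directory destDirWithZipPath t))
(let ((zipFilePath (concat destDirRoot zipDirName ".zip")))
  (when (file-exists-p zipFilePath)
    (delete-file zipFilePath)))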

;; create the new dest dir
(make-directory destDirWithZipPath t)

;; copy stuff over to dest dir
;; (shell-command (concat "cp -R " sourceDir " " destDirRoot) )
(copy-directory sourceDir destDirWithZipPath )

;; delete “.svn” dir and other files we don't want
(shell-command (concat "find " destDirWithZipPath " -depth -name \".svn\" -type d -exec rm -R {} ';'" ) )

;; (require 'find-lisp)
;; (mapc 'my-process-file
;;  (find-lisp-find-files destDirWithZipPath "\\.svn$")
;;  (find-lisp-find-files "c:/Users/xah/xx2/ergoemacs_1.9.1.1/build-util/" "")
;;  (find-lisp-find-dired-subdirectories "c:/Users/xah/xx2/ergoemacs_1.9.1.1/build-util/")
;; )

;; delete emacs backup files
;; (shell-command (concat "find " destDirWithZipPath " -name \"*~\" -exec rm {} ';'" ) )
(require 'find-lisp)
(mapc 'delete-file (find-lisp-find-files destDirWithZipPath "~$"))

;; delete Windows specific setup dir
;; (shell-command (concat " rm -R " destDirWithZipPath "win32-setup"))
(delete-directory (concat destDirWithZipPath "win32-setup") t)

;; delete misc files we don't need
(delete-file (concat destDirWithZipPath "Makefile"))
(delete-file (concat destDirWithZipPath "build-util/build_ergoemacs_package.el"))

;; byte compile elc files
(load-file (concat destDirWithZipPath "build-util/byte-compile_lisp_files.el"))

;; zip it
(cd destDirRoot)
(shell-command (concat "zip -r " zipDirName ".zip " zipDirName ) )

;; change current dir back
(cd (expand-file-name (file-name-directory buffer-file-name)))

;; TODO
;; ideally, change all shell calls to elisp functions so it's not dependent on shell.
;; using elisp for build is just experimental. We can revert to unix shell in the future.

;; currently, the version number is hard coded. We probably want to make use of svn's tag feature for the version stamp, for building both the Windows release and the elisp package release.