Checking out a single branch from a GitHub repo
If we need only a single code branch, e.g., "experimental", use:
git clone --single-branch --branch experimental https://github.com/[name]/[repo].git localrepo
Searching for non-ASCII characters in a file
Most online solutions involve grep -P -n "[\x80-\xFF]", but these do not work with the grep shipped on Mac OS X or other BSD variants, which lacks the -P (Perl regex) option. Instead, use Perl for a more portable solution:
perl -ne 'print if /[^[:ascii:]]/' filename.txt
Tags: software, textprocessing, Unix, MacOS
Sampling a large text file
Given a large enough text file, sampling solutions like the shuf utility will run out of memory. For statistical sampling, use one line of awk:
awk 'BEGIN {srand()} !/^$/ { if (rand() <= 0.01) print $0}' input.txt > output.txt
This returns about 1% of the non-empty lines in the file. For an exact number of lines, use a higher sampling ratio and pipe the output through head -n N.
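Truncating with head favors lines near the start of the file. If that matters, one alternative is reservoir sampling, which is a different technique from the one-liner above: it keeps only N lines in memory and gives every non-empty line the same chance of being selected. A rough sketch, assuming a POSIX awk; the function name sample_exact is made up for illustration:

```shell
# sample_exact N FILE
# Print exactly N non-empty lines chosen uniformly at random (reservoir sampling),
# or all non-empty lines if the file has fewer than N of them.
# (sample_exact is a hypothetical helper name.)
sample_exact() {
  awk -v n="$1" '
    BEGIN { srand() }
    !/^$/ {
      seen++
      if (seen <= n) {
        keep[seen] = $0                 # fill the reservoir first
      } else {
        j = int(rand() * seen) + 1      # uniform integer in 1..seen
        if (j <= n) keep[j] = $0        # replace a slot with probability n/seen
      }
    }
    END {
      m = (seen < n) ? seen + 0 : n + 0
      for (i = 1; i <= m; i++) print keep[i]
    }' "$2"
}
```

Usage: sample_exact 1000 input.txt > output.txt

Like the one-liner, srand() seeds from the clock, so two runs within the same second may return the same sample. The output comes out in reservoir order, not in the original file order.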
Tags: software, textprocessing, Unix
Shrinking large PDF images
So you have some large (MB-sized) PDF images and you need to reduce them in size, maybe because arXiv requires images to be compressed. Starting with a file f1.pdf (1031756 bytes):
- Use ImageMagick to compress to JPG or PNG and then re-encode as PDF. The JPG compression is very efficient, but the PDF re-encoding is not.
convert f1.pdf -format JPG -quality 50 f1a.jpg → 78532 bytes (7.6%)
convert f1.pdf -format JPG -quality 10 f1a.pdf → 758028 bytes (73%)
convert f1.pdf -format JPG -quality 90 f1a.pdf → 758028 bytes (73%)
convert f1.pdf -format PNG -quality 50 f1a.pdf → 758028 bytes (73%)
- ImageMagick output can be processed with jpeg2ps and then epstopdf for better results:
convert f1.pdf -format JPG -quality 50 f1a.jpg
jpeg2ps f1a.jpg > f1a.eps
epstopdf f1a.eps → 81228 bytes (8%)
- Use Ghostscript with the /screen or /ebook PDF output settings.
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
   -dPDFSETTINGS=/screen -sOutputFile=f1b.pdf f1.pdf
→ 176120 bytes (17%)
The /ebook setting output was nearly the same size as the /screen output.
- On Mac OS X (10.14.5 Mojave), exporting from Preview with the "Reduce File Size" Quartz filter gave excellent results (105368 bytes, 10%). This is harder to access from the command line, but the default filter is at:
/System/Library/Filters/Reduce File Size.qfilter
and the ColorSync utility can create modified versions of that filter in the ~/Library/Filters folder.
Preview's Save as JPG also gives very good results:
Quality setting = 7 [1...9] → 166649 bytes (16%)
Quality setting = 5 [1...9] → 87077 bytes (8.5%)
- Adobe Acrobat Pro has a PDF Optimizer, but does not give as compact results even with 72 DPI output and minimum JPG quality settings (412528 bytes, 40%). Photoshop and Illustrator can produce fairly compact JPGs that can be wrapped back into PDF files (as above), but Preview offers a simple and good enough solution.
Bottom line: Use Preview for conversions by hand, or use the ImageMagick convert utility for JPG output, then wrap it as PDF via jpeg2ps and epstopdf, if scripting is required.
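For the scripted route, the three commands above can be wrapped into one function. This is only a sketch under a few assumptions: bash, a single-page input (the [0] suffix asks ImageMagick for the first page only), and an epstopdf that accepts --outfile. The function name shrink_pdf and the default quality of 50 are choices made for illustration.

```shell
# shrink_pdf IN.pdf OUT.pdf [QUALITY]
# Rasterize the first page of IN.pdf to JPG, then wrap the JPG back into a PDF.
# (shrink_pdf is a hypothetical helper name; QUALITY defaults to 50.)
shrink_pdf() {
  local in=$1 out=$2 quality=${3:-50} tool tmp status
  [ -f "$in" ] || { echo "shrink_pdf: no such file: $in" >&2; return 1; }
  for tool in convert jpeg2ps epstopdf; do
    command -v "$tool" >/dev/null 2>&1 ||
      { echo "shrink_pdf: $tool not found" >&2; return 1; }
  done
  tmp=$(mktemp -d) || return 1
  # Same pipeline as above, using a scratch directory for the intermediates.
  convert "${in}[0]" -format JPG -quality "$quality" "$tmp/page.jpg" &&
    jpeg2ps "$tmp/page.jpg" > "$tmp/page.eps" &&
    epstopdf --outfile="$out" "$tmp/page.eps"
  status=$?
  rm -rf "$tmp"
  return $status
}
```

Usage: shrink_pdf f1.pdf f1a.pdf 50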
Tags: software, MacOS, graphics, PDF
What's inside a mystery software package file?
A package (.pkg file in OS X) is an .xar archive containing a cpio.gz archive of installable files in "Payload", along with a "bill of materials", scripts, etc. To inspect the contents, unpack the .xar into a directory, and then open the Payload file:
mkdir scratch; cd scratch
xar -xf ../mystery.pkg
gunzip -dc Payload | cpio -i
The file hierarchy shows where the contents of the package would be distributed during installation.
Upgrading Python packages with Anaconda
To update all installed packages:
conda update conda
conda update --all
To upgrade to a new distribution:
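conda update conda
conda update anaconda
This moves every package to the versions pinned by the latest stable Anaconda release, which is usually not what we're looking for, since individual packages may end up older than their newest available versions.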
No sound on my Mac
Switching the output sound device repeatedly (e.g., going back and forth between external speakers and headphones several times) sometimes kills sound output on my laptop (currently on OS X 10.13 High Sierra). To fix, restart the Core Audio services:
sudo killall coreaudiod
Annoying, but workable.
Tags: MacOS, bug-workaround
Why won't my Mac go to sleep?
On Mac OS X 10.13 (High Sierra), power management status is reported by:
pmset -g
pmset -g assertions
Items with non-zero assertions (like "UserIsActive") are preventing sleep.
Tags: MacOS
How is this blog generated?
This web log is generated by a modified version of BashBlog, a simple Bash script blogging engine. My version, with customized CSS and global variables, is available here.