omniverse theirix's Thoughts About Research and Development

The tale of automating BibDesk

bibtex

To organise scientific publications I use the standard LaTeX tool BibTeX. It would be wise to switch to biblatex later, since it handles UTF-8 better, but that is a different story. BibTeX specifies a file format for publications with many standard and custom fields, where each field is a plain-text key-value pair. For example (from Wikipedia):

@Book{hicks2001,
 author    = "von Hicks, III, Michael",
 title     = "Design of a Carbon Fiber Composite Grid Structure for the GLAST
              Spacecraft Using a Novel Manufacturing Technique",
 publisher = "Stanford Press",
 year      =  2001,
 address   = "Palo Alto",
 edition   = "1st",
}
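To illustrate the key-value structure, here is a minimal Ruby sketch that pulls the fields of a single entry into a Hash. The function name `parse_bibtex_entry` is my own invention, and the regular expressions only cover quoted and bare values as in the entry above; braced values, string macros and nested quotes are not handled, so treat it as an illustration rather than a real BibTeX parser.

```ruby
# Minimal sketch: extract the fields of one BibTeX entry into a Hash.
# Handles only "quoted" or bare (number/word) values; not a full parser.
def parse_bibtex_entry(text)
  # Entry header, e.g. "@Book{hicks2001,"
  header = text.match(/@(\w+)\s*\{\s*([^,\s]+)\s*,/)
  return nil unless header

  fields = {}
  # Matches: key = "value"   or   key = 2001
  text.scan(/(\w+)\s*=\s*(?:"([^"]*)"|(\w+))/) do |key, quoted, bare|
    value = quoted || bare
    # Collapse line breaks and runs of whitespace inside multi-line values.
    fields[key.downcase] = value.gsub(/\s+/, " ").strip
  end

  { type: header[1].downcase, cite_key: header[2], fields: fields }
end

ENTRY = <<~BIB
  @Book{hicks2001,
   author    = "von Hicks, III, Michael",
   title     = "Design of a Carbon Fiber Composite Grid Structure for the GLAST
                Spacecraft Using a Novel Manufacturing Technique",
   publisher = "Stanford Press",
   year      =  2001,
   address   = "Palo Alto",
   edition   = "1st",
  }
BIB

entry = parse_bibtex_entry(ENTRY)
puts entry[:cite_key]        # prints hicks2001
puts entry[:fields]["year"]  # prints 2001
```

Note that every value comes back as a string, including the year, which matches the "text key-value pair" nature of the format.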

You can edit a BibTeX file by hand, but there are good programs for presenting and editing publications, such as JabRef, Mendeley and BibDesk. I use BibDesk because of its good user interface and its integration with OS X.

My research involves a lot of specific BibTeX tasks. For example, I often need to find a publication's cite key or title by filename and vice versa, grep citations, set PDF title and author fields from a publication, and publish missing publications to my Kindle via Calibre. These are many small tasks that add up to a noticeable amount of time and are worth automating. I prefer a command-line utility for them: it accepts a command verb and an optional argument and prints a list of strings as output. The Unix way rocks.
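The "verb plus optional argument in, list of strings out" shape could be sketched in Ruby roughly as follows. Everything here is hypothetical: the verbs (`keys`, `title`, `files`, `key-for-file`), the `run` helper and the in-memory `PUBLICATIONS` sample data are placeholders for whatever a real script would fetch from BibDesk.

```ruby
# Sketch of a verb-based command-line tool. PUBLICATIONS is made-up sample
# data standing in for what a real script would read from BibDesk.
PUBLICATIONS = [
  { cite_key: "hicks2001",
    title: "Design of a Carbon Fiber Composite Grid Structure",
    files: ["papers/hicks2001.pdf"] },
  { cite_key: "doe2010", title: "An Untitled Example", files: [] }
].freeze

# Each verb maps to a lambda that takes the optional argument and returns
# an array of strings, one per output line.
COMMANDS = {
  "keys"  => ->(_arg) { PUBLICATIONS.map { |p| p[:cite_key] } },
  "title" => ->(key)  { PUBLICATIONS.select { |p| p[:cite_key] == key }.map { |p| p[:title] } },
  "files" => ->(key)  { PUBLICATIONS.select { |p| p[:cite_key] == key }.flat_map { |p| p[:files] } },
  "key-for-file" => lambda { |path|
    name = File.basename(path.to_s)
    PUBLICATIONS.select { |p| p[:files].any? { |f| File.basename(f) == name } }
                .map { |p| p[:cite_key] }
  }
}.freeze

# Dispatch on the verb; unknown verbs yield a single error line.
def run(argv)
  verb, arg = argv
  command = COMMANDS[verb]
  return ["unknown command: #{verb.inspect}"] unless command

  command.call(arg)
end

# Print one result per line when executed as a script.
puts run(ARGV) if __FILE__ == $PROGRAM_NAME
```

Saved as, say, `bib.rb`, running `ruby bib.rb files hicks2001` would print the linked file paths one per line, which makes the output easy to pipe into grep, xargs and friends.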

Scripting BibDesk

A few years ago I discovered that BibDesk has extensive AppleScript support. I could write a script for each of these tasks, but anyone who has ever written AppleScript knows how painful it is to write complex data processing in it. I wanted to write the script in a friendlier language, effectively any other language, preferably Ruby or Python, or even plain C or Java.

Scripting Bridge version

The first version of the script was written against Scripting Bridge. It stopped working when MacRuby, which it required, died. MacRuby was needed because Scripting Bridge is based on Cocoa, and the only good way to use Cocoa from Ruby was MacRuby. MacRuby development halted in 2012.

Appscript version

The second version was rewritten with [ruby appscript](http://appscript.sourceforge.net/rb-appscript/). The script was nice, except that each domain object has to be extracted manually from its scripting reference with a .get call. Here is an example for the task that returns the files for a cite key:

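require 'appscript'
include Appscript

# Return the URLs of the files linked to the publication with the given cite key.
def files_for_cite(cite_str)
    app("BibDesk").documents.first.publications.get
        .find { |pub| pub.cite_key.get == cite_str }.linked_files.get
        .map { |f| f.url.get }.compact
end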

It uses a typical pseudo-functional Ruby call chain that can filter, map and zip things. Then in 2014 I realised that the script did not work on Mavericks and Yosemite: appscript simply does not build there because of missing symbols. The official page says it is dead too.

Swift version

A week ago I wrote a third version of the script. Currently application scripting is also available from Objective-C (through Scripting Bridge) and from JavaScript. I do not know JavaScript well enough to write the script in it, and I do not like it at all. Writing the script in Objective-C is possible but very verbose. So I picked up Apple's new language, Swift. Apple positions Swift as a replacement for Objective-C that works with existing codebases and greatly improves on Objective-C's syntax and safety. That suits my purpose!

First you need to generate Swift bindings for Scripting Bridge using an experimental project, SwiftScripting. (Objective-C bindings are supported out of the box.) Then you need to fix the generated bindings by hand, because the project is experimental (see below). After that, just write a Swift script.

The same task in Swift looks like this:

func files_for_cite(cite: String) -> [String] {
    let app: BibDeskApplication =
        SBApplication(bundleIdentifier: "edu.ucsd.cs.mmccrack.bibdesk")
    let pubs = (app.documents!().get()[0] as! BibDeskDocument).publications!()
        .get() as! [BibDeskPublication]
    let pub = pubs.filter({ pub in pub.citeKey! == cite }).first!
    return (pub.linkedFiles!() as [AnyObject])
        .map({ (f: AnyObject) in f as! BibDeskLinkedFile })
        .filter { x in x != nil }.map { x in x! }
}

Huh! I cannot even write it as a single chained call because the number of parentheses and braces becomes astronomical. Technically Swift encourages chained calls, but they feel very cumbersome because of Swift's static type system: the proxy objects in the chain have to be cast to specific types. Swift's functional capabilities are also limited by the rather weak support in the Foundation and Cocoa libraries, while Ruby has a lot of useful functions in Enumerable and Array. Sometimes I simply wrote a Swift replacement for a Ruby function to make the port more direct.

Swift impressions

It took an hour or two to read the Swift manual and Stack Overflow questions, and a few more hours to rewrite and debug a dozen tasks in Swift. The major problems in porting were unwrapping values and type casting.

Optional types are pretty good and can protect you from raw pointer usage and NPEs. They are a little similar to Rust's Option enum but with added syntactic sugar (question and exclamation marks).

Swift can infer a type from the right-hand expression, so a variable declaration does not need a type annotation. If type inference is not possible, you need to cast manually using the as operator. Casting became a nightmare because Scripting Bridge provides only untyped pointers that require casting from and to AnyObject. It feels like the sad days of programming in Java 1.4 with non-generic containers.

So Swift is a pretty language that is objectively better than Objective-C :) It has nice features that simplify existing code and improve its safety and readability. Programming Scripting Bridge in Swift is not very comfortable but entirely possible. It seems to be the only sane way to script OS X applications without dealing with AppleScript or Objective-C syntax.