How to write a good debian/watch easily
Short URL: http://bit.ly/debian_watch
Updates:
2016-10-16:
- Added a section about ‘fake watches’.
- Changed all versions from ‘version=3’ to ‘version=4’.
2015-08-25:
- Added a section about ‘merging uversionmangle and dversionmangle’.
Writing a good watch file for a Debian package is simple. However, many maintainers don’t know how to write watch files for non-trivial sites such as Google Pages or GitHub. So my goal is to teach a bit about debian/watch.
A conventional debian/watch has two lines: the version and the site watcher. An example for a simple, trivial site:
version=4
http://f00l.de/pcapfix/pcapfix-(.+)\.tar\.gz
The first line is mandatory and tells the Debian watch system (on the official Debian project sites) which format version is in use and how to behave. Currently, this line is always “version=4”, so you can forget about it. The last line (the site watcher) is responsible for scanning the site and searching for versions of the source code (tarballs). This line was built from simple observation and Perl regular expressions. On the upstream site (http://f00l.de/pcapfix), we can hover over a download link to see the file URL. Please see the picture below, which shows this situation:
PS: for a summary of Perl regular expressions, go to the site http://www.cs.tut.fi/~jkorpela/perl/regexp.html.
It is possible to see the web link f00l.de/pcapfix/pcapfix-0.7.3.tar.gz, and we will try to use it. We should change the version from 0.7.3 to (.+). In Perl regular expressions, .+ means any character except newline, one or more times. So we want to monitor pcapfix-ANYTHING.tar.gz. The parentheses tell the watch system that whatever appears between them is to be taken as the version. The final result is http://f00l.de/pcapfix/pcapfix-(.+)\.tar\.gz. Note that the last part (after the last slash) contains a regexp, and that the dots were escaped with backslashes so that they are not interpreted as “any character”.
PS: Some people like to use (.*). However, in regular expressions, this means nothing or anything. So http://f00l.de/pcapfix/pcapfix-(.*)\.tar\.gz will also match pcapfix-.tar.gz, and you should avoid this.
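The difference between (.*) and (.+) can be checked quickly outside uscan. The sketch below uses Python's re module instead of Perl, on the assumption that both engines treat these simple patterns the same way; the helper name extract_version is made up for this illustration.

```python
import re

# The two candidate patterns for the last part of the watch line.
STAR = r"pcapfix-(.*)\.tar\.gz"   # zero or more characters
PLUS = r"pcapfix-(.+)\.tar\.gz"   # one or more characters


def extract_version(pattern, filename):
    """Return the captured version, or None when the name does not match."""
    match = re.fullmatch(pattern, filename)
    return match.group(1) if match else None


for name in ("pcapfix-0.7.3.tar.gz", "pcapfix-.tar.gz"):
    print(name,
          "| (.*):", repr(extract_version(STAR, name)),
          "| (.+):", repr(extract_version(PLUS, name)))
```

With (.*), the broken name pcapfix-.tar.gz is accepted with an empty version; with (.+), it is rejected.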
To test a watch file, we need to run the ‘uscan’ command from the top of the upstream source tree (outside the debian directory, though that directory must exist; debian/changelog and debian/watch are essential for testing, because uscan gets the local version from the former and compares it with the results generated by the latter).
$ uscan --verbose --report
The result:
eriberto@canopus:/tmp/pcapfix-0.7.3$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://f00l.de/pcapfix/pcapfix-(.+)\.tar\.gz
-- Found the following matching hrefs:
     pcapfix-0.7.3.tar.gz
     pcapfix-0.7.2.tar.gz
     pcapfix-0.7.1.tar.gz
     pcapfix-0.7.tar.gz
     pcapfix-0.6.tar.gz
     pcapfix-0.5.tar.gz
     pcapfix-0.4.tar.gz
     pcapfix-0.3.tar.gz
     pcapfix-0.2-real.tar.gz
     pcapfix-0.1.tar.gz
Newest version on remote site is 0.7.3, local version is 0.7.3
 => Package is up to date
-- Scan finished
As you can see, all versions were shown and the package is using the latest release. This was very easy. However, we can use another, more sophisticated technique, which is useful for more complex sites such as Google Pages and GitHub.
The site shown in the last picture has web links, so we can collect them. To do this, we must initially use the URL of the page that has the links, followed by (.*). Note that, this time, (.*) is used only as a temporary step. An example:
version=4
http://f00l.de/pcapfix (.*)
After the uscan command, the output was:
eriberto@canopus:/tmp/pcapfix-0.7.3$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://f00l.de/pcapfix (.*)
-- Found the following matching hrefs:
     /
     /
     /blog/
     /blog/
     /pcapfix/
     /pcapfix/
     /genmenu
     /genmenu
     /hacking/
     /hacking/
     /impressum.html
     /impressum.html
     /hacking/pcapfix.php
     /hacking/pcapfix.php
     pcapfix-0.4.png
     pcapfix-0.4.png
     /hacking/pcapfix.example.txt
     /hacking/pcapfix.example.txt
     mailto:ruport@f00l.de
     mailto:ruport@f00l.de
     pcapfix-0.7.3.tar.gz
     pcapfix-0.7.3.tar.gz
     pcapfix-0.7.3-win32.zip
     pcapfix-0.7.3-win32.zip
     changelog-0.7.3.txt
     changelog-0.7.3.txt
     pcapfix-0.7.2.tar.gz
     pcapfix-0.7.2.tar.gz
     pcapfix-0.7.2-win32.zip
     pcapfix-0.7.2-win32.zip
     changelog-0.7.2.txt
     changelog-0.7.2.txt
     pcapfix-0.7.1.tar.gz
     pcapfix-0.7.1.tar.gz
     pcapfix-0.7.1-win32.zip
     pcapfix-0.7.1-win32.zip
     changelog-0.7.1.txt
     changelog-0.7.1.txt
     pcapfix-0.7.tar.gz
     pcapfix-0.7.tar.gz
     pcapfix-0.7-win32.zip
     pcapfix-0.7-win32.zip
     changelog-0.7.txt
     changelog-0.7.txt
     pcapfix-0.6.tar.gz
     pcapfix-0.6.tar.gz
     pcapfix-0.6-win32.zip
     pcapfix-0.6-win32.zip
     changelog-0.6.txt
     changelog-0.6.txt
     pcapfix-0.5.tar.gz
     pcapfix-0.5.tar.gz
     pcapfix-0.5-win32.zip
     pcapfix-0.5-win32.zip
     changelog-0.5.txt
     changelog-0.5.txt
     pcapfix-0.4.tar.gz
     pcapfix-0.4.tar.gz
     pcapfix-0.4-win32.zip
     pcapfix-0.4-win32.zip
     changelog-0.4.txt
     changelog-0.4.txt
     pcapfix-0.3.tar.gz
     pcapfix-0.3.tar.gz
     changelog-0.3.txt
     changelog-0.3.txt
     pcapfix-0.2-real.tar.gz
     pcapfix-0.2-real.tar.gz
     changelog-0.2.txt
     changelog-0.2.txt
     pcapfix-0.1.tar.gz
     pcapfix-0.1.tar.gz
     mailto:info@f00l.de
     mailto:info@f00l.de
dpkg: error: version 'mailto:ruport@f00l.de' has bad syntax: epoch in version is not number
Newest version on remote site is mailto:ruport@f00l.de, local version is 0.7.3
dpkg: error: version 'mailto:ruport@f00l.de' has bad syntax: epoch in version is not number
 => Newer version available from
    http://f00l.de/pcapfix/mailto:ruport@f00l.de
-- Scan finished
Note that all web links on the page are shown. We can then use another format in debian/watch to filter the result:
version=4
<URL> <expression to grep>
Yes! You can ‘grep’ the result (applying a Perl regexp). So you may use (note the space between the URL and the regexp):
version=4
http://f00l.de/pcapfix pcapfix-(.+)\.tar\.gz
See the result:
eriberto@canopus:/tmp/pcapfix-0.7.3$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://f00l.de/pcapfix pcapfix-(.+)\.tar\.gz
-- Found the following matching hrefs:
     pcapfix-0.7.3.tar.gz
     pcapfix-0.7.3.tar.gz
     pcapfix-0.7.2.tar.gz
     pcapfix-0.7.2.tar.gz
     pcapfix-0.7.1.tar.gz
     pcapfix-0.7.1.tar.gz
     pcapfix-0.7.tar.gz
     pcapfix-0.7.tar.gz
     pcapfix-0.6.tar.gz
     pcapfix-0.6.tar.gz
     pcapfix-0.5.tar.gz
     pcapfix-0.5.tar.gz
     pcapfix-0.4.tar.gz
     pcapfix-0.4.tar.gz
     pcapfix-0.3.tar.gz
     pcapfix-0.3.tar.gz
     pcapfix-0.2-real.tar.gz
     pcapfix-0.2-real.tar.gz
     pcapfix-0.1.tar.gz
     pcapfix-0.1.tar.gz
Newest version on remote site is 0.7.3, local version is 0.7.3
 => Package is up to date
-- Scan finished
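The ‘grep’ step can be imitated in a few lines. This is only a rough sketch in Python, not how uscan is implemented: it assumes that the pattern has to match the whole href, and the function name filter_hrefs is invented for the example. The hrefs are a small sample taken from the (.*) scan of http://f00l.de/pcapfix.

```python
import re

# A few of the hrefs reported by the '(.*)' scan of the page.
HREFS = [
    "/hacking/pcapfix.php",
    "mailto:ruport@f00l.de",
    "pcapfix-0.7.3.tar.gz",
    "pcapfix-0.7.3-win32.zip",
    "changelog-0.7.3.txt",
    "pcapfix-0.2-real.tar.gz",
]


def filter_hrefs(pattern, hrefs):
    """Keep hrefs that match the whole pattern; return (href, version) pairs."""
    found = []
    for href in hrefs:
        match = re.fullmatch(pattern, href)
        if match:
            found.append((href, match.group(1)))
    return found


for href, version in filter_hrefs(r"pcapfix-(.+)\.tar\.gz", HREFS):
    print(href, "->", version)
```

Only the two tarballs survive the filter; the zip files, changelogs and mailto links are dropped.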
Another example. Please see the site https://sites.google.com/site/doctormike/pacman.html, which provides the game Pacman for Console (pacman4console in Debian). The picture below shows a link on the site:
The first idea is to use something like this:
version=4
https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz\?attredirects=0
or
version=4
https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz.*
But the results are:
eriberto@canopus:/tmp/pacman4console-1.2$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz\?attredirects=0
uscan warning: In debian/watch, no matching hrefs for watch line
   https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz\?attredirects=0
-- Scan finished
and:
eriberto@canopus:/tmp/pacman4console-1.2$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz.*
uscan warning: In debian/watch, no matching hrefs for watch line
   https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz.*
-- Scan finished
So we must use the URL scan method. First, use:
version=4
https://sites.google.com/site/doctormike/pacman.html (.*)
The output:
eriberto@canopus:/tmp/pacman4console-1.2$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   https://sites.google.com/site/doctormike/pacman.html (.*)
-- Found the following matching hrefs:
     https://sites.google.com/site/doctormike/pacman-1.2.ebuild?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1.2.tar.gz?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1.1.tar.gz?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1.0.tar.gz?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1-1.png?attredirects=0
     https://www.google.com/a/UniversalLogin?service=jotspot&continue=https://sites.google.com/site/doctormike/pacman.html
     /site/doctormike/system/app/pages/reportAbuse
     javascript:;
     /site/doctormike/system/app/pages/removeAccess
     http://sites.google.com
dpkg: error: version 'javascript:;' has bad syntax: epoch in version is not number
Newest version on remote site is javascript:;, local version is 1.2
dpkg: error: version 'javascript:;' has bad syntax: epoch in version is not number
 => Newer version available from
    https://sites.google.com/site/doctormike/javascript:;
-- Scan finished
Now we can write a ‘grep’ based on https://sites.google.com/site/doctormike/pacman-1.2.tar.gz?attredirects=0 or, simply, .*pacman-1.2.tar.gz.* (it is a grep, man!). Four solutions, among others, are:
version=4
https://sites.google.com/site/doctormike/pacman.html https://sites.google.com/site/doctormike/pacman-(.+)\.tar\.gz\?attredirects=0

version=4
https://sites.google.com/site/doctormike/pacman.html https://sites.google.com/.*/pacman-(.+)\.tar\.gz.*

version=4
https://sites.google.com/site/doctormike/pacman.html .*/pacman-(.+)\.tar\.gz.*

version=4
https://sites.google.com/site/doctormike/pacman.html .*pacman-(.+)\.tar\.gz.*
PS: these examples are available in plain text at http://eriberto.pro.br/files/debian_watch_example.txt.
See the result when using a solution:
eriberto@canopus:/tmp/pacman4console-1.2$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   https://sites.google.com/site/doctormike/pacman.html .*/pacman-(.+)\.tar\.gz.*
-- Found the following matching hrefs:
     https://sites.google.com/site/doctormike/pacman-1.2.tar.gz?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1.1.tar.gz?attredirects=0
     https://sites.google.com/site/doctormike/pacman-1.0.tar.gz?attredirects=0
Newest version on remote site is 1.2, local version is 1.2
 => Package is up to date
-- Scan finished
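Why the leading and trailing .* matter here can be seen with a small Python check. It is only an approximation of what uscan does: it assumes the pattern must cover the whole href, so the absolute URL in front of the file name and the query string after it both have to be accounted for. The function name version_of is invented for the example.

```python
import re

# Two hrefs taken from the scan output of the Google Sites page.
TARBALL = "https://sites.google.com/site/doctormike/pacman-1.2.tar.gz?attredirects=0"
EBUILD = "https://sites.google.com/site/doctormike/pacman-1.2.ebuild?attredirects=0"


def version_of(pattern, href):
    """Return the captured version if the pattern covers the whole href."""
    match = re.fullmatch(pattern, href)
    return match.group(1) if match else None


# Bare file-name pattern: ignores the URL prefix and the '?attredirects=0' tail.
print(version_of(r"pacman-(.+)\.tar\.gz", TARBALL))
# Pattern with '.*' on both sides: covers the prefix and the tail.
print(version_of(r".*/pacman-(.+)\.tar\.gz.*", TARBALL))
# The ebuild link is still rejected because it has no '.tar.gz' part.
print(version_of(r".*/pacman-(.+)\.tar\.gz.*", EBUILD))
```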
For GitHub, I will use as an example the project hosted at https://github.com/Rup0rt/netmate. On GitHub, there is a link named release. See the picture below.
If you click on the release link, a new page opens showing the Releases and Tags links; both lead to web links that refer to the release versions, e.g. v0.16. Then you can use:
version=4
https://github.com/Rup0rt/netmate/releases /Rup0rt/netmate/archive/v(.+)\.tar\.gz
PS: remember that you must use ‘https://github.com/Rup0rt/netmate/releases (.*)’ for the initial scan.
The result:
eriberto@canopus:netmate-0.16$ uscan --verbose --report
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   https://github.com/Rup0rt/netmate/releases /Rup0rt/netmate/archive/v(.+)\.tar\.gz
-- Found the following matching hrefs:
     /Rup0rt/netmate/archive/v0.16.tar.gz
Newest version on remote site is 0.16, local version is 0.16
 => Package is up to date
-- Scan finished
As a new example, go to the site http://eriberto.pro.br/files/myprogram. Suppose you need to check all Linux versions, and they use the tar.bz2, tar.gz and tar.xz extensions. You then need to create a rule using the non-capturing group (?:) feature. You can see an example of this below.
version=4
http://eriberto.pro.br/files/myprogram myprogram-(.+)\.tar\.(?:bz2|gz|xz)
Note that (?:bz2|gz|xz) matches names that have bz2, gz or xz, but the matched extension is not captured, so it stays out of the result. Another solution for this case is:
version=4
http://eriberto.pro.br/files/myprogram/myprogram-(.+)\.tar\.(?:bz2|gz|xz)
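To see what ‘non-capturing’ means in practice, here is a small comparison written in Python, whose (?:...) syntax is the same as Perl's. The name captured_groups is invented for the example, and the .tar.xz and .zip file names are hypothetical.

```python
import re

CAPTURING = r"myprogram-(.+)\.tar\.(bz2|gz|xz)"        # extension becomes group 2
NON_CAPTURING = r"myprogram-(.+)\.tar\.(?:bz2|gz|xz)"  # extension is matched only


def captured_groups(pattern, name):
    """Return the tuple of captured groups, or None if the name does not match."""
    match = re.fullmatch(pattern, name)
    return match.groups() if match else None


# The .tar.xz and .zip names are hypothetical, used only for illustration.
for name in ("myprogram-2.0.tar.gz", "myprogram-2.0.tar.xz", "myprogram-2.0.zip"):
    print(name, captured_groups(CAPTURING, name), captured_groups(NON_CAPTURING, name))
```

With the non-capturing form, the only captured group is the version, which is what the watch line needs.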
If needed, you can select between two or more possibilities. Please see https://github.com/baruch/diskscan/releases. There are two name patterns for the versions. The respective watch file is:
version=4
https://github.com/baruch/diskscan/releases .*/(?:diskscan-)?(\d\S+)\.tar\.(?:bz2|gz|xz)
Note that (?:diskscan-)? was used to make diskscan- optional in the results. The \d means ‘one digit’, i.e. 0-9.
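A quick way to convince yourself that the optional group works is to try the pattern on both naming styles. The sketch below is in Python and assumes whole-href matching; the two hrefs and their version numbers are hypothetical, written only to imitate the two styles, and the function name versions is invented.

```python
import re

PATTERN = r".*/(?:diskscan-)?(\d\S+)\.tar\.(?:bz2|gz|xz)"

# Hypothetical hrefs, made up to show a name with and without the 'diskscan-' prefix.
HREFS = [
    "/baruch/diskscan/archive/0.19.tar.gz",
    "/baruch/diskscan/archive/diskscan-0.13.tar.gz",
]


def versions(pattern, hrefs):
    """Return the captured version of every href that matches the whole pattern."""
    matches = (re.fullmatch(pattern, href) for href in hrefs)
    return [m.group(1) for m in matches if m]


print(versions(PATTERN, HREFS))
```

In both cases only the numeric part is captured, because the prefix sits inside a non-capturing group.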
Another available resource is uversionmangle and dversionmangle. These can be used to change how the system sees the versions. uversionmangle (u=upstream) applies to the upstream site and dversionmangle (d=Debian) to the local package. As an example, consider again the site http://eriberto.pro.br/files/myprogram and local names using the ‘+dfsg-1’ suffix, where ‘-1’ can be another number, such as ‘-2.3’. Then an initial idea for the watch file is:
version=4
http://eriberto.pro.br/files/myprogram myprogram-(.+)\.tar\.(?:bz2|gz)
The result:
Newest version on remote site is 2.0, local version is 1.0.2-1+dfsg
 => Newer version available from
    http://eriberto.pro.br/files/myprogram/myprogram-2.0.tar.gz
-- Scan finished
As shown, the local version is composed of the upstream version plus ‘-1+dfsg’. We can use dversionmangle to apply a ‘sed’-like substitution that removes ‘-1+dfsg’. The final solution is:
version=4
opts=dversionmangle=s/-\d\+dfsg// \
http://eriberto.pro.br/files/myprogram myprogram-(.+)\.tar\.(?:bz2|gz)
See the output:
Newest version on remote site is 2.0, local version is 1.0.2-1+dfsg
 (mangled local version number 1.0.2)
 => Newer version available from
    http://eriberto.pro.br/files/myprogram/myprogram-2.0.tar.gz
-- Scan finished
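The effect of the dversionmangle rule can be reproduced outside uscan. The sketch below rewrites the Perl substitution s/-\d\+dfsg// with Python's re.sub; the function name mangle_local_version is made up, and this is an illustration of the rule, not of uscan's internals.

```python
import re


def mangle_local_version(version):
    # Python counterpart of the Perl rule s/-\d\+dfsg// : remove the first
    # occurrence of '-', one digit and the literal text '+dfsg'.
    return re.sub(r"-\d\+dfsg", "", version, count=1)


print(mangle_local_version("1.0.2-1+dfsg"))  # the local version shown by uscan
print(mangle_local_version("2.0"))           # nothing to remove, left unchanged
```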
Now that you understand how debian/watch works, it is time to say that the best combination, in general cases, is (\d\S+) instead of (.+). The \d means a digit (0-9). The \S is any non-whitespace character. So, with (\d\S+), we require a digit followed by at least one more character, such as a digit, a dot, etc. Another best practice is to allow for a possible change of file extension by upstream. So it is good to use \.tar\.(?:bz2|gz|xz) instead of \.tar\.bz2, \.tar\.gz or similar. See some examples below:
version=4
http://f00l.de/pcapfix/pcapfix-(\d\S+)\.tar\.(?:bz2|gz|xz)

version=4
https://sites.google.com/site/doctormike/pacman.html .*/pacman-(\d\S+)\.tar\.(?:bz2|gz|xz).*

version=4
http://eriberto.pro.br/files/myprogram myprogram-(\d\S+)\.tar\.(?:bz2|gz|xz)
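The practical gain of (\d\S+) over (.+) is that names whose ‘version’ does not start with a digit are skipped. A minimal Python comparison, assuming whole-name matching; the function name captured is invented and the file pcapfix-latest.tar.xz is hypothetical.

```python
import re

LOOSE = r"pcapfix-(.+)\.tar\.(?:bz2|gz|xz)"
STRICT = r"pcapfix-(\d\S+)\.tar\.(?:bz2|gz|xz)"

# The last name is hypothetical; the first two appear in the pcapfix scan.
NAMES = ["pcapfix-0.7.3.tar.gz", "pcapfix-0.2-real.tar.gz", "pcapfix-latest.tar.xz"]


def captured(pattern, names):
    """Map each name to its captured version, or None when it does not match."""
    result = {}
    for name in names:
        match = re.fullmatch(pattern, name)
        result[name] = match.group(1) if match else None
    return result


print(captured(LOOSE, NAMES))
print(captured(STRICT, NAMES))
```

The loose pattern reports ‘latest’ as a version, while the strict one ignores that file.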
To finish, you can see and test several examples of debian/watch at http://anonscm.debian.org/viewvc/sepwatch/trunk/watchfiles. As said, the site http://www.cs.tut.fi/~jkorpela/perl/regexp.html has an overview of regular expressions. Pay special attention to the \d and \S matches, because they are used a lot in watch files. The uscan manpage also has several examples and explanations. There is some important information at https://wiki.debian.org/debian/watch too. The post http://stackoverflow.com/questions/3512471/non-capturing-group explains the non-capturing group feature of Perl regular expressions. Another example of this is at http://deadlytechnology.com/web-development-tips/perl-regex. A good tip is the URL http://www.regexe.com, where you can test the results of your Perl regular expressions. Alternatively, there is http://www.regexplanet.com/advanced/perl/index.html, but I prefer the previous link.
I hope this helps someone to write their debian/watch files.
Merging uversionmangle and dversionmangle
If needed, you can combine uversionmangle and dversionmangle using a comma as the separator. See an example:
opts=uversionmangle=s/-src//,dversionmangle=s/\+dfsg// \
Another possibility is to chain two actions inside uversionmangle or dversionmangle using a semicolon. Look at this example:
opts=uversionmangle=s/-src//;s/-versao/-version/ \
Changing the subject: since debian/watch uses Perl regular expressions, you can take advantage of capture groups. Please consider the files at http://eriberto.pro.br/files/myprogram2. They are:
myprogram2-0.1.tar.gz
myprogram2-0.2-beta.tar.gz
myprogram2-0.2.tar.gz
myprogram2-alpha-0.1.tar.gz
myprogram2-beta-0.1.tar.gz
The first problem is that a program version cannot start with an alphabetic character. See an example using an initial debian/watch:
version=4
http://eriberto.pro.br/files/myprogram2 myprogram2-(.+)\.tar\.(?:bz2|gz|xz)
See the result:
$ uscan
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   http://eriberto.pro.br/files/myprogram2 myprogram2-(.+)\.tar\.(?:bz2|gz|xz)
-- Found the following matching hrefs:
     myprogram2-0.1.tar.gz (0.1)
     myprogram2-0.1.tar.gz (0.1)
     myprogram2-0.2-beta.tar.gz (0.2-beta)
     myprogram2-0.2-beta.tar.gz (0.2-beta)
     myprogram2-0.2.tar.gz (0.2)
     myprogram2-0.2.tar.gz (0.2)
     myprogram2-alpha-0.1.tar.gz (alpha-0.1)
     myprogram2-alpha-0.1.tar.gz (alpha-0.1)
     myprogram2-beta-0.1.tar.gz (beta-0.1)
     myprogram2-beta-0.1.tar.gz (beta-0.1)
dpkg: warning: version '1:beta-0.1-0' has bad syntax: version number does not start with digit
Newest version on remote site is beta-0.1, local version is 0.1
dpkg: warning: version '1:beta-0.1-0' has bad syntax: version number does not start with digit
 => Newer version available from
    http://eriberto.pro.br/files/myprogram2/myprogram2-beta-0.1.tar.gz
-- Scan finished
So the right upstream version must be 0.1-alpha or 0.1-beta, instead of alpha-0.1 or beta-0.1. The second problem is that, for Debian, 0.1 < 0.1-beta. You can check this using the ‘dpkg --compare-versions’ command. Look at this example (gt = greater than; lt = less than):
$ dpkg --compare-versions 0.1-beta gt 0.1 && echo true
true
In the Debian world, versions can use ‘~’ or ‘+’ to establish the right ordering:
0.1~beta < 0.1 < 0.1+beta
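For readers without dpkg at hand, the ordering can be explored with the simplified comparison below. It is a sketch in Python written from my reading of the version ordering rules in Debian Policy, not the dpkg code itself, so ‘dpkg --compare-versions’ remains the reference; all function names are invented for the example.

```python
def _order(ch):
    """Sort weight of one character; '' stands for the end of the string."""
    if ch == "" or ch.isdigit():
        return 0
    if ch == "~":
        return -1              # tilde sorts before everything, even the end
    if ch.isalpha():
        return ord(ch)         # letters sort before other symbols
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings; return -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Non-digit run: compare character by character using _order().
        while (i < len(a) and not a[i].isdigit()) or (j < len(b) and not b[j].isdigit()):
            wa = _order(a[i] if i < len(a) else "")
            wb = _order(b[j] if j < len(b) else "")
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Digit run: compare numerically (an empty run counts as 0).
        si, sj = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na, nb = int(a[si:i] or 0), int(b[sj:j] or 0)
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    """Split 'epoch:upstream-revision' into (epoch, upstream, revision)."""
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    """Return -1 if a sorts before b, 0 if they are equal, 1 if a sorts after b."""
    ea, ua, ra = _split(a)
    eb, ub, rb = _split(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


for left, right in (("0.1~beta", "0.1"), ("0.1", "0.1+beta"), ("0.1-beta", "0.1")):
    print(left, "vs", right, "->", compare_versions(left, right))
```

Note that 0.1-beta sorts after 0.1 because the text after the last hyphen is treated as a Debian revision, which is why the tilde form is needed for pre-releases.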
So, we need to use 0.1~alpha, 0.1~beta, etc. To change the order of the upstream versions, you can use:
version=4
opts=uversionmangle=s/(alpha|beta)-([\d\.]+)/$2~$1/;s/([\d\.]+)-(alpha|beta)/$1~$2/ \
http://eriberto.pro.br/files/myprogram2 myprogram2-(.+)\.tar\.(?:bz2|gz|xz)
In the s/(alpha|beta)-([\d\.]+)/$2~$1/ statement, each group delimited by parentheses can be referred to as $<number>. So ‘(alpha|beta) = $1’ and ‘([\d\.]+) = $2’. Using $1 and $2, we can invert the order, putting a tilde between the groups. Try to understand the second part of the line, that is, s/([\d\.]+)-(alpha|beta)/$1~$2/. This part acts on 0.2-beta. The final result is:
$ uscan
-- Scanning for watchfiles in .
-- Found watchfile in ./debian
-- In debian/watch, processing watchfile line:
   opts=uversionmangle=s/(alpha|beta)-([\d\.]+)/$2~$1/;s/([\d\.]+)-(alpha|beta)/$1~$2/ http://eriberto.pro.br/files/myprogram2 myprogram2-(.+)\.tar\.(?:bz2|gz|xz)
-- Found the following matching hrefs:
     myprogram2-0.1.tar.gz (0.1)
     myprogram2-0.1.tar.gz (0.1)
     myprogram2-0.2-beta.tar.gz (0.2~beta)
     myprogram2-0.2-beta.tar.gz (0.2~beta)
     myprogram2-0.2.tar.gz (0.2)
     myprogram2-0.2.tar.gz (0.2)
     myprogram2-alpha-0.1.tar.gz (0.1~alpha)
     myprogram2-alpha-0.1.tar.gz (0.1~alpha)
     myprogram2-beta-0.1.tar.gz (0.1~beta)
     myprogram2-beta-0.1.tar.gz (0.1~beta)
Newest version on remote site is 0.2, local version is 0.1
 => Newer version available from
    http://eriberto.pro.br/files/myprogram2/myprogram2-0.2.tar.gz
-- Scan finished
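The two substitutions can also be tried by hand. The sketch below rewrites them with Python's re.sub, where the Perl references $1 and $2 become \1 and \2; the rules are applied in order, as the semicolon in the watch file suggests. The names RULES and mangle_upstream are invented for the example.

```python
import re

# Python counterparts of the two Perl rules, applied in the same order.
RULES = [
    (r"(alpha|beta)-([\d\.]+)", r"\2~\1"),   # alpha-0.1 -> 0.1~alpha
    (r"([\d\.]+)-(alpha|beta)", r"\1~\2"),   # 0.2-beta  -> 0.2~beta
]


def mangle_upstream(version):
    """Apply each substitution once, one after the other."""
    for pattern, replacement in RULES:
        version = re.sub(pattern, replacement, version, count=1)
    return version


for version in ("0.1", "0.2-beta", "0.2", "alpha-0.1", "beta-0.1"):
    print(version, "->", mangle_upstream(version))
```

The output lists the same mangled versions that uscan shows in parentheses.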
Using fake packages to provide a useful debian/watch
Fake watches can be used to avoid giving the false impression that you ignored the watch file out of laziness, and to report the actual upstream status. This is not an original Debian resource; it is my own creation. I created three fake packages. These packages say whether there is no upstream site, whether there is no release on the upstream site, or whether the upstream site has a package but doesn’t allow tracking it (using JavaScript to block it, for example).
Fake watches are very easy to use. You can see how to use my fake watches here. There are some comments here.
That’s all. Enjoy!
—
There is another option for the watch file: using a Debian service known as Githubredir. Githubredir is a redirector for GitHub repositories. I think it is unnecessary, because your text is excellent. If somebody wants to use it, that’s cool. For example, my fortunes-mario package.
$ cat debian/watch
version=3
http://githubredir.debian.net/github/fike/fortunes-mario/fortunes-mario-(.*).tar.gz
Githubredir:
– http://githubredir.debian.net/
Thanks Fike!!!!
Thanks for this article.
Please note that the regex pattern “(.*)” matches the empty string; so, a pattern like “pcapfix-(.*)\.tar\.gz” will match the string “pcapfix-.tar.gz”. That’s not desirable.
Fortunately, the solution is simple: in the pattern, instead of zero or more characters, require one or more: “pcapfix-(.+)\.tar\.gz”.
The ‘uscan(1)’ manpage and other documentation avoids “.*” and similar patterns for exactly this reason.
Can you please update your article to avoid “(.*)” and instead use “(.+)” in the patterns? Thank you for helping maintainers use the tools correctly!
Thanks a lot Ben.
You are right. Since I wrote this article I changed my concepts and I use \d\S+ now. I will update the article soon.
Hi all. I made a general review in the post now. Thanks!