DXR Documentation
Release 0.1
various
January 23, 2015
Contents
1 Contents
1.1 Welcome To The DXR Community
1.2 Getting Started
1.3 Configuration
1.4 Deployment
1.5 Use
1.6 Development
2 Back Matter
2.1 Glossary
2.2 Icon Credits
DXR Documentation, Release 0.1
DXR is a code search and navigation tool aimed at making sense of large projects like Firefox. It supports full-text
and regex searches as well as structural queries like “Find all the callers of this function.” Behind the scenes, it uses
trigram indices, the re2 library, and static analysis data collected by instrumented compilers to make searches faster
and more accurate than is possible with simple tools like grep.
Here’s an example of DXR running against the Firefox codebase.
[Figure: screenshot of the DXR web interface]
CHAPTER 1
Contents
1.1 Welcome To The DXR Community
Though DXR got its start at Mozilla, it’s seen contributions from a variety of companies, individuals, and places
around the world. We welcome contributions of code, bug reports, and helpful feedback.
1.1.1 Bug Reports
Did something explode? Not act as you expected? Let us know.
1.1.2 Submitting Patches
To contribute code, just file a pull request. Include tests to double your chances of getting it merged and qualify for a
free Bundt cake. We love tests. Bundt cake isn’t bad, either.
1.1.3 IRC
We hang out in the #static channel of irc.mozilla.org. Poke your head in, and say hello.
If you have questions, please address them to the public channel; don’t /msg someone in particular. That way, more
people have a chance at answering your question, and more people can benefit from hearing the answers. We realize
that no one likes looking naive, but please be brave and set an example to embolden the less-brave naive people. We’re
a friendly bunch and will never deride anyone for being a beginner.
1.1.4 Open Bugs
Looking for something to hack on? Here are...
• Our easy bugs
• All our bugs
Before starting work on a bug, hop into the IRC channel and confirm it’s still relevant. We try to garden our bugs, but
DXR often moves faster than we can weed.
1.1.5 Wiki
The wiki is full of roadmap documents, sketches toward future features, and bikeshedding sessions. Feel free to
scribble on it.
1.2 Getting Started
Note: These instructions are suited to trying out DXR to see if you like it. If you plan to contribute code to DXR
itself, please see Development instead.
The easiest way to get DXR working on your own machine is...
1. Get the source code you want to index.
2. Tell DXR how to build it.
3. Run dxr-build.py to build and index your code.
4. Run dxr-serve.py to present a web-based search interface.
But first, we have some installation to do.
1.2.1 Downloading DXR
Using git, clone the DXR repository:
git clone --recursive https://github.com/mozilla/dxr.git
Remember the --recursive option; DXR depends on the TriLite SQLite extension, which is included in the
repository as a git submodule.
1.2.2 Booting And Building
DXR runs only on Linux at the moment (and possibly other UNIX-like operating systems). The easiest way to get
things set up is to use the included, preconfigured Vagrant VM. You’ll need Vagrant and a virtualization provider for
it. We recommend VirtualBox.
Once you’ve installed VirtualBox and Vagrant, run these commands in DXR’s top-level directory:
vagrant plugin install vagrant-vbguest
vagrant up
vagrant ssh
Then, run this inside the VM:
cd ~/dxr
make
Note: The Vagrant image is built for VirtualBox 4.2.0 or newer. If your version is older, the image might not work as
expected.
Your Vagrant version may require a specific vbguest plugin installation method. If you receive errors about the plugin,
visit the vbguest plugin page.
1.2.3 Configuration
Before DXR can index your code, we need to tell it where it is and, if you want to be able to do structural queries
like find-all-the-callers, how to kick off a build. (Currently, DXR supports structural queries only for C and C++.) If
you have a simple build process powered by make, a configuration like this might suffice. Place the following in a file
called dxr.config. The location of the file doesn’t matter; the usual place is adjacent to your source directory.
[DXR]
target_folder = /path/for/the/output

[yourproject]
source_folder = /path/to/your/code
object_folder = /path/to/your/code
build_command = make clean; make -j $jobs
Note: Be sure to replace the placeholder paths in the above config.
By building your project with clang and under the control of dxr-build.py, DXR gets a chance to interpose a custom
compiler plugin that emits analysis data. It then processes that data into an index.
If you have a non-C++ project and simply want to index it as text, the build_command can be set to /bin/true
or some other do-nothing command.
Though you shouldn’t need any of them yet, further config directives are described in Configuration.
1.2.4 Indexing
Now that you’ve told DXR about your codebase, it’s time to build an index (sometimes also called an instance):
dxr-build.py dxr.config
Note: If you have a large codebase, the VM might run out of RAM. If that happens, make a copy of the
vagrantconfig_local.yaml-dist file in the top-level dxr directory, name it vagrantconfig_local.yaml, and
edit it to increase the VM’s RAM:
cp vagrantconfig_local.yaml-dist vagrantconfig_local.yaml
vi vagrantconfig_local.yaml
Then restart the VM. Within the VM...
sudo shutdown -h now
Then, from the host machine...
vagrant up
vagrant ssh
Note: If you have trouble getting your own code to index, step back and see if you can get one of the included test
cases to work:
cd ~/dxr/tests/test_basic
make
If that works, it’s just a matter of getting your configuration right. Pop into #static on irc.mozilla.org if you need a
hand.
1.2.5 Serving Your Index
Congratulations; your index is built! Now, spin up DXR’s development server, and see what you’ve wrought:
dxr-serve.py --all /path/to/the/output
Surf to http://33.33.33.77:8000/ from the host machine, and poke around your fancy new searchable codebase.
Note: Seeing this error?
Server Error
Database error: no such module: trilite
Run sudo ldconfig inside the virtual machine to sort out the shared library linking problem. Then, re-run dxr-serve.py,
and all should work as expected.
1.3 Configuration
DXR learns how to index your source trees by means of an ini-formatted configuration file:
[DXR]
target_folder = /path/for/the/output

[yourproject]
source_folder = /path/to/your/code
object_folder = /path/to/your/code
build_command = make clean; make -j $jobs
It gets passed to dxr-build.py at indexing time:
dxr-build.py my_config_file.config
1.3.1 Sections
The configuration file is divided into sections. The [DXR] section holds global options; other sections describe trees
to be indexed.
You can use all the fancy interpolation features of Python’s ConfigParser class to save repetition.
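As a rough illustration of that interpolation, here is a minimal sketch using Python 3's configparser module (the %(name)s syntax is the same in the older ConfigParser module). Deriving object_folder from source_folder, and the obj subdirectory, are assumptions made for this example rather than anything DXR requires.

```python
import configparser

# Example config text; the paths are placeholders.
CONFIG_TEXT = """
[DXR]
target_folder = /path/for/the/output

[yourproject]
source_folder = /path/to/your/code
object_folder = %(source_folder)s/obj
build_command = make clean; make -j $jobs
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Every section other than [DXR] describes a tree to be indexed.
tree_names = [name for name in parser.sections() if name != "DXR"]

for name in tree_names:
    # %(source_folder)s is expanded from the same section when read.
    print(name, "->", parser.get(name, "object_folder"))
```

Note that $jobs is left untouched by the parser; only %(...)s references are interpolated.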
[DXR] Section
Here are the options that can live in the [DXR] section:
target_folder Where to put the built index. Required.
temp_folder The default container for individual tree-specific temp folders. Default: /tmp/dxr-temp. Overriding it is recommended, both to avoid exceeding the size of the /tmp volume and to avoid collisions between concurrent indexing
runs.
directory_index Filename for directory index files in the generated static HTML. Default:
.dxr-directory-index.html
Resist the temptation to use index.html for directory_index. Any indexed file with the same name
would then shadow the directory index, confusing users.
disabled_plugins Names of plugins to disable. Default: empty
disable_workers If non-empty, do not use a worker pool for building the static HTML. Default: empty
enabled_plugins Names of plugins to enable. Default: *
filter_lang The default programming language for this instance. Only filters registered for this language will be
used. Default: C
generated_date The “generated on” date stamped at the bottom of every DXR web page, in RFC 822 format (as revised by
RFC 2822). Default: the time the indexing run started
log_folder The default container for individual tree-specific log folders. Default: <temp_folder>/logs.
nb_jobs Number of processes allowed in worker pools. Default: 1. This value can be overridden by the -j
argument to dxr-build.py.
plugin_folder Folder to search for plugins. Default: <dxr_root>/plugins. This will soon be deprecated
in favor of a new plugin discovery model.
Please note that dxr-build.py assumes the plugins in plugin_folder are already built and ready for use. If
you specify your own plugin folder, the top-level makefile will not take care of this for you.
skip_stages Build/indexing stages to skip: zero or more of index and html, space-separated. Default: none
wwwroot URL path prefix to the root of DXR’s web app. Default: /
google_analytics_key Google analytics key. If set, the analytics snippet will be added automatically to every
page.
(Refer to the Plugin-Specific Options section for plugin keys available here.)
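For the generated_date option above, a date string in the RFC 2822 style can be produced with Python's standard library. This is only a sketch: the helper name rfc2822_date is hypothetical, and whether DXR prefers a GMT or a numeric-offset timestamp is an assumption not confirmed by this document.

```python
from email.utils import formatdate


def rfc2822_date(timestamp):
    """Format a Unix timestamp as an RFC 2822 date string in GMT."""
    return formatdate(timestamp, usegmt=True)


# The Unix epoch, as a fixed example value.
stamp = rfc2822_date(0)
print("generated_date = " + stamp)
```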
Tree Sections
Any section that is not named [DXR] represents a tree to be indexed. Here are the options describing a tree:
build_command Command for building your source code. Default: make -j $jobs. Note that $jobs will be
replaced with nb_jobs from the config file or the value of the -j option from the dxr-build.py command line.
If you define a build_command not containing $jobs, you will be warned, but indexing will continue.
disabled_plugins Plugins disabled in this tree, in addition to ones already disabled in the [DXR] section.
Default: *
enabled_plugins Plugins enabled in this tree. Default: *. It is impossible to enable a plugin not already enabled
in the [DXR] section.
ignore_patterns Space-separated list of Unix shell-style file patterns to ignore.
log_folder Folder for indexing logs. Default: <global log_folder>/<tree>
object_folder Folder where object files will be stored. Required.
source_folder The folder containing the source code to index. Required.
source_encoding The Unicode encoding of the tree’s source files. Default: utf-8
temp_folder Folder for temporary items made during indexing. Default: <global
temp_folder>/<tree>. You generally shouldn’t set this.
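The $jobs substitution described under build_command can be pictured with the sketch below. It uses string.Template purely for illustration; the function name expand_build_command is hypothetical, and DXR's own substitution and warning mechanism may differ.

```python
from string import Template


def expand_build_command(build_command, nb_jobs):
    """Replace $jobs in a build command, warning if the placeholder is absent."""
    if "$jobs" not in build_command:
        # Mirrors the documented behavior: warn, but carry on.
        print("warning: build_command does not contain $jobs")
    return Template(build_command).safe_substitute(jobs=nb_jobs)


print(expand_build_command("make clean; make -j $jobs", 4))
print(expand_build_command("/bin/true", 4))
```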
1.3.2 Plugin-Specific Options
Options prefixed with plugin_ (except plugin_folder) are reserved for use by plugins. These options can
appear in the global [DXR] section or in tree sections. Plugin developers should name their config options like
plugin_<plugin name>_<option>. (See Writing Plugins for more details on plugin development.)
At the moment, all the existing plugin options are valid only in tree sections:
plugin_buglink_name Name of the tree’s bug tracker installation, e.g. Mozilla’s Bugzilla
plugin_buglink_regex Regex for finding bug references to link in the source code. Default:
(?i)bug\s+#?([0-9]+)
plugin_buglink_url URL pattern for building links to tickets. %s will be replaced with the ticket number. The
option should include the URL scheme.
plugin_omniglot_p4web The URL to the root of a p4web installation. Default: http://p4web/
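To see how the buglink options fit together, the sketch below applies the default plugin_buglink_regex to a line of source and fills a URL pattern with each ticket number. The URL https://bugs.example.com/... is a made-up placeholder, and bug_links is a hypothetical helper, not part of the buglink plugin.

```python
import re

# The documented default for plugin_buglink_regex.
BUG_REGEX = re.compile(r"(?i)bug\s+#?([0-9]+)")

# Hypothetical value for plugin_buglink_url; %s receives the ticket number.
URL_PATTERN = "https://bugs.example.com/show_bug.cgi?id=%s"


def bug_links(line):
    """Return the ticket URLs for every bug reference found in a line."""
    return [URL_PATTERN % match.group(1) for match in BUG_REGEX.finditer(line)]


for url in bug_links("// Workaround for Bug 12345; see also bug #67."):
    print(url)
```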
1.4 Deployment
Once you decide to put DXR into production for use by multiple people, it’s time to move beyond the Getting Started
instructions. You likely need real machines—not Vagrant VMs—and you definitely need a robust web server like
Apache. This chapter helps you deploy DXR on the Linux machines [1] of your choice and configure them to handle
multi-user traffic volumes.
DXR generates an index for one or more source trees offline. This is well suited to a dedicated build server. The
generated index is then transferred to one or more web servers for hosting.
1.4.1 Dependencies
OS Packages
Since you’re no longer using the Vagrant VM, you’ll need to install several packages on both your build and web
servers. These are the Ubuntu package names, but they should be clear enough to map to their equivalents on other
distributions:
• make
• build-essential
• libclang-dev (clang dev headers 3.3 or 3.4)
• llvm-dev (LLVM dev headers 3.3 or 3.4)
• pkg-config
• mercurial (to check out re2)
• libsqlite3-dev
• npm (Node.js and its package manager)
Technically, you could probably do without most of these on the web server, though you’d then need to build DXR on
a different machine and transfer it over.
Note: On some systems (for example Debian and Ubuntu) the Node.js interpreter is named nodejs, but DXR expects
it to be named node. One simple solution is to add a symlink:
sudo ln -s /usr/bin/nodejs /usr/bin/node
Note: The list of packages above is maintained by hand and might fall behind, despite our best efforts. If you suspect
something is missing, look at vagrant_provision.sh in the DXR source tree, which does the actual setup of
the VM and is automatically tested.
[1] DXR might also work with other UNIX-like operating systems, but we make no promises.
Python Packages
You’ll also need several third-party Python packages. In order to isolate the specific versions we need from the rest of
the system, use Virtualenv:
virtualenv dxr_venv # Create a new virtual environment.
source dxr_venv/bin/activate
You’ll need to repeat that activate command each time you want to use DXR from a new shell.
Now, with your new virtualenv active, you can install the requisite packages:
cd dxr
./peep.py install -r requirements.txt
1.4.2 Building
First, if you cannot arrange for the correct versions of llvm-config, clang, and clang++ to be available under those
names, whether by a mechanism like Debian’s alternatives system or with symlinks, you will need to edit the makefile
in dxr/plugins/clang to specify complete paths to the right ones.
Then, build DXR from its top-level directory:
make
It will build the libtrilite.so library in the trilite directory and libclang-index-plugin.so in
dxr/plugins/clang as well as compiling the JavaScript-based templates.
To assure yourself that everything has built correctly, you can run the tests:
make test
1.4.3 Installation
Once you’ve built it, install DXR in the activated virtualenv. This is an optional step, but it lets you call the dxr-build.py and dxr-serve.py commands without specifying their full paths, as long as the env is activated.
python setup.py install
It’s also convenient to install the TriLite library globally. Otherwise, dxr-build.py will complain that it can’t find the
TriLite SQLite extension unless you prepend LD_LIBRARY_PATH=dxr/trilite at every invocation. It’s also
a challenge to get a web server to see the lib, since you don’t have a ready opportunity to interpose an environment
variable. To install TriLite...
sudo cp dxr/trilite/libtrilite.so /usr/local/lib/
sudo ldconfig
1.4.4 Indexing
Now that we’ve got DXR installed on both the build and web machines, let’s talk about just the build server for a
moment.
As in Getting Started, copy your projects’ source trees to the build server, and create a config file. (See Configuration
for details.) Then, kick off the indexing process:
dxr-build.py dxr.config
Note: You can also pass the --tree TREE option to generate the index for just one source tree. This is useful for
building each tree on a different machine, though it does leave you with the task of stitching the resulting single-tree
indexes together, a matter of moving some directories around and tweaking the generated config.py file.
The index is generated in the directory specified by the target_folder directive. It contains a minimal configuration file, a SQLite database to support search, and static HTML versions of all of the files in the source trees.
Generally, you use something like cron to repeat indexing on a schedule or in response to source tree changes. After
an indexing run, the index has to be made available to the web servers. One approach is to share it on a common NFS
volume (and use an atomic mv to swap the new one into place). Alternatively, you can simply copy the index to the
web server (in which case an atomic mv remains advisable, of course).
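One common way to get an atomic swap is to publish the index through a symlink and rename a fresh symlink over the old one. The sketch below shows that variant; it assumes the web server is pointed at the symlink rather than at a real directory, and the names publish_index, index-1, index-2, and target are hypothetical.

```python
import os
import tempfile


def publish_index(new_index_dir, live_link):
    """Atomically repoint the symlink live_link at new_index_dir."""
    live_link = os.path.abspath(live_link)
    tmp_link = "%s.tmp-%d" % (live_link, os.getpid())
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # left over from an interrupted run
    os.symlink(os.path.abspath(new_index_dir), tmp_link)
    # A rename replaces the old symlink in one step on the same filesystem.
    os.replace(tmp_link, live_link)


# Demonstration in a scratch directory.
scratch = tempfile.mkdtemp()
first = os.path.join(scratch, "index-1")
second = os.path.join(scratch, "index-2")
os.mkdir(first)
os.mkdir(second)
live = os.path.join(scratch, "target")

publish_index(first, live)   # initial publication
publish_index(second, live)  # later indexing run swaps in the new index
print(os.readlink(live))
```

Readers of the link see either the old index or the new one, never a half-copied directory. Old index directories still need to be cleaned up separately.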
1.4.5 Serving Your Index
Now let’s set up the web server. Here we have some alternatives.
dxr-serve.py
The dxr-serve.py script is a tiny web server for publishing an index. Though it is underpowered for production use, it
can come in handy for testing that the index arrived undamaged and DXR’s dependencies are installed:
dxr-serve.py target
Then visit http://localhost:8000/.
As with dxr-build.py above, you can pass an LD_LIBRARY_PATH environment variable to dxr-serve.py if you are
unable to install the TriLite library globally on your system:
LD_LIBRARY_PATH=dxr/trilite dxr-serve.py target
Apache and mod_wsgi
DXR is also a WSGI application and can be deployed on Apache with mod_wsgi, on uWSGI, or on any other web
server that supports the WSGI protocol.
The main mod_wsgi directive is WSGIScriptAlias, and the DXR WSGI application is defined in dxr/wsgi.py, so
an example Apache directive might look something like this:
WSGIScriptAlias / /path/to/dxr/dxr/wsgi.py
You must also specify the path to the generated index. This is done with a DXR_FOLDER environment variable. For
example, add this to your Apache configuration:
SetEnv DXR_FOLDER /path/to/target
As with dxr-build.py and dxr-serve.py above, either pass an LD_LIBRARY_PATH environment variable
to mod_wsgi, or install the libtrilite.so library onto your system globally. Because of the ways
LD_LIBRARY_PATH and mod_wsgi work, adding it to your regular Apache configuration has no effect. Instead,
add the following to /etc/apache2/envvars:
export LD_LIBRARY_PATH=/path/to/dxr/trilite
Because we used virtualenv to install DXR’s runtime dependencies, add the path to the virtualenv to your Apache
configuration:
WSGIPythonHome /path/to/dxr_venv
Note that the WSGIPythonHome directive is allowed only in the server config context, not in the virtual host context.
It’s analogous to running virtualenv’s activate command.
Finally, make sure mod_wsgi is installed and enabled. Then, restart Apache:
sudo apache2ctl stop
sudo apache2ctl start
Note: Changes to /etc/apache2/envvars don’t take effect if you run only sudo apache2ctl restart.
Additional configuration might be required, depending on your version of Apache, your other Apache configuration,
and where DXR is installed. For example, if you can’t access your DXR index and your Apache error log contains
lines like client denied by server configuration: /path/to/dxr/dxr/wsgi.py, try adding
this to your Apache configuration:
<Directory /path/to/dxr/dxr>
Require all granted
</Directory>
Here is a complete example config, for reference:
WSGIPythonHome /home/vagrant/dxr_venv
<VirtualHost *:80>
    # Serve static resources, like CSS and images, with plain Apache:
    Alias /static/ /home/vagrant/dxr/dxr/static/

    # We used to make special efforts to also serve the static pages of
    # HTML-formatted source code from the tree via plain Apache, but that
    # tangle of RewriteRules saved us only about 20ms per request. You can do
    # it if you're on a woefully underpowered machine, but I'm not maintaining
    # it.

    # Tell this instance of DXR where its target folder is:
    SetEnv DXR_FOLDER /home/vagrant/dxr/tests/test_basic/target/

    WSGIScriptAlias / /usr/local/lib/python2.7/site-packages/dxr/dxr.wsgi
</VirtualHost>
uWSGI
uWSGI is the new hotness and well worth considering. The first person to deploy DXR under uWSGI should document
it here.
1.4.6 Upgrading
To update to a new version of DXR...
1. Update your DXR clone:
git pull origin master
git submodule update
2. Delete your old virtual env:
rm -rf /path/to/dxr_venv
3. Repeat these parts of the installation:
(a) Python Packages
(b) Building
(c) Installation
1.5 Use
1.5.1 Querying
DXR queries are almost entirely text-based. In addition to being fast to input for experienced users, having an all-text
representation invites handy tricks like Firefox keyword bookmarks.
A DXR query is a series of space-delimited terms:
• Filtered terms are structured as <filter name>:<argument>:
– callers:frobulate
– var:num_caribou
Everything but plain text search is done using filtered terms.
• Text terms are just bare text and do simple substring matching:
– hello
– three independent words
The way terms are combined is somewhat odd and will change in a future version. For now, the behavior is as follows:
terms are ANDed together on a per-file basis and then ORed together on a per-line basis. For example, if you searched
for hairy gerbils, the results would be files containing both the words “hairy” and “gerbils”, but the lines shown
would be ones containing either “hairy” or “gerbils”. The upside is that this makes multi-line comments and strings
easy to find.
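The combination rule can be modeled on toy data as follows. This is a sketch of the described semantics only, using plain substring tests over an in-memory dictionary; it says nothing about how DXR implements search, and the function name search and the sample files are invented.

```python
def search(files, terms):
    """Model DXR's term combination on {path: [lines]} data.

    A file qualifies only if it contains every term (AND per file);
    within a qualifying file, a line is shown if it contains any term
    (OR per line). Returns {path: [(line_number, line), ...]}.
    """
    results = {}
    for path, lines in files.items():
        whole_file = "\n".join(lines)
        if all(term in whole_file for term in terms):
            results[path] = [
                (number, line)
                for number, line in enumerate(lines, 1)
                if any(term in line for term in terms)
            ]
    return results


FILES = {
    "pets.txt": ["hairy dogs", "bald cats", "small gerbils"],
    "zoo.txt": ["hairy yaks only"],
}

for path, hits in search(FILES, ["hairy", "gerbils"]).items():
    print(path, hits)
```

Here zoo.txt is excluded because it lacks “gerbils”, while pets.txt shows two lines that each contain only one of the terms.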
Quoting
Single and double quotes can be used in filter arguments and in text terms to help express literal spaces and other
oddities. Singles can contain doubles, doubles can contain singles, and each kind can contain itself if backslash-escaped:
• A phrase with a space: "Hello, world"
• Quotes in a plain text search, taken as literals since they’re not leading: id="whatShouldIDoContent"
• Double quotes inside single quotes, as a filter argument: regexp:'"wh(at|y)'
• Backslash escaping: "I don't \"believe\" in fairies."
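The quoting rules above can be approximated with a small tokenizer. This is a hedged sketch, not DXR's parser: the regular expression, the set of characters allowed in filter names, and the parse_query function are all assumptions made for illustration.

```python
import re

# One query term: an optional "filter:" prefix, then an argument that is
# double-quoted, single-quoted, or a bare run of non-space characters.
TERM = re.compile(r'''
    \s*
    (?:(?P<filter>[A-Za-z_-]+):)?
    (?:
        "(?P<dq>(?:\\.|[^"\\])*)"
      | '(?P<sq>(?:\\.|[^'\\])*)'
      | (?P<bare>\S+)
    )
''', re.VERBOSE)


def parse_query(query):
    """Split a query into (filter_name_or_None, argument) pairs."""
    terms = []
    for match in TERM.finditer(query):
        quoted = match.group("dq")
        if quoted is None:
            quoted = match.group("sq")
        if quoted is not None:
            # Drop the backslash from escaped characters such as \" or \'.
            argument = re.sub(r"\\(.)", r"\1", quoted)
        else:
            # Quotes that are not leading stay literal in bare text.
            argument = match.group("bare")
        terms.append((match.group("filter"), argument))
    return terms


print(parse_query('callers:frobulate "Hello, world"'))
print(parse_query('id="whatShouldIDoContent"'))
```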
1.5.2 Highlighting
Source code views support highlighting lines, runs of lines, and even multiple runs of lines at once.
There are four ways to highlight. Each updates the hash portion of the URL so the highlighted regions are maintained
when a page is bookmarked or shared via chat, bug reports, etc.
single click Single-click a line to select it. Click it again to deselect it. Single-clicking a line will also deselect all
other lines.
single click then shift-click After selecting a single line, hold Shift, and click a line above or below it to highlight the
entire range between.
control- or command-click Hold Control or Command (depending on your OS) while clicking a line to add it to the
set of already highlighted lines. Do it again to deselect it.
control- or command-click, then shift-click After selecting one or more lines, use Control- or Command-Click to
highlight the first in a new range of lines. Then, Shift-click, and the second range will be added to the existing
highlighted set.
1.6 Development
1.6.1 Architecture
DXR divides into 2 halves:
1. The indexer, dxr-build.py, is a batch job which analyzes code and builds on-disk indices.
The indexer hosts various plugins which handle everything from syntax coloring to static analysis. The clang
plugin, for example, which handles structural analysis of C++ code, builds the project under clang while interposing a custom compiler plugin. The plugin rides sidecar with the compiler, dumping out structural data into
CSV files, which the DXR plugin later pulls in and uses to generate the SQLite tables that support structural
queries like callers: and function:.
Generally, the indexer is kicked off asynchronously—often even on a separate machine—by cron or a build
system. It’s up to deployers to come up with strategies that make sense for them.
2. A Flask web application which lets users query those indices. The development entrypoint for the web application is dxr-serve.py, but a more robust method should be used for Deployment.
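The CSV-to-SQLite step in the first half can be pictured with a toy example. The column names, the callers table, and the callers_of helper below are hypothetical; the real CSV layout and schema produced by the clang plugin are not described in this document.

```python
import csv
import io
import sqlite3

# Stand-in for a CSV file dumped by the compiler plugin (invented columns).
CSV_DATA = "caller,callee\nmain,frobulate\nhelper,frobulate\nmain,helper\n"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE callers (caller TEXT, callee TEXT)")

# Pull the CSV rows in and load them into a table.
rows = [(row["caller"], row["callee"])
        for row in csv.DictReader(io.StringIO(CSV_DATA))]
conn.executemany("INSERT INTO callers VALUES (?, ?)", rows)


def callers_of(name):
    """Answer a structural query in the spirit of callers:<name>."""
    cursor = conn.execute(
        "SELECT caller FROM callers WHERE callee = ? ORDER BY caller", (name,))
    return [row[0] for row in cursor]


print(callers_of("frobulate"))
```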
1.6.2 Setting Up
Here we show the fastest way to get hacking on DXR.
Downloading DXR
Using git, clone the DXR repository:
git clone --recursive https://github.com/mozilla/dxr.git
Remember the --recursive option; DXR depends on the TriLite SQLite extension, which is included in the
repository as a git submodule.
Booting And Building
DXR runs only on Linux at the moment (and possibly other UNIX-like operating systems). The easiest way to get
things set up is to use the included, preconfigured Vagrant VM. You’ll need Vagrant and a virtualization provider for
it. We recommend VirtualBox.
Once you’ve installed VirtualBox and Vagrant, run these commands in DXR’s top-level directory:
vagrant plugin install vagrant-vbguest
vagrant up
vagrant ssh
Then, run this inside the VM:
cd ~/dxr
make
Note: The Vagrant image is built for VirtualBox 4.2.0 or newer. If your version is older, the image might not work as
expected.
Your Vagrant version may require a specific vbguest plugin installation method. If you receive errors about the plugin,
visit the vbguest plugin page.
Running A Test Index
The folder-based test cases make decent workspaces for development, suitable for manually trying out your changes.
test_basic is a good one to start with. To get it running...
cd ~/dxr/tests/test_basic
make
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:../../trilite dxr-serve.py -a target
You can then surf to http://33.33.33.77:8000/ from the host machine and play around. When you’re done, stop the
server with Control-C.
1.6.3 Workflow
The repository on your host machine is mirrored over to the VM via Vagrant’s shared-folder magic. Changes you
make outside the VM will be instantly available within and vice versa, so you can edit using your usual tools on the
host and still use the VM to run DXR.
After making changes to DXR, a build step is sometimes needed to see the effects of your work:
Changes to C-based compiler plugins or TriLite: make (at the root of the project)
Changes to HTML templates that are used on the client side: make templates. (This is a subset of make,
above, and may be faster.) Alternatively, leave node_modules/.bin/grunt watch running, and it will
take care of recompiling the templates as necessary.
Changes to server-side HTML templates or the DB schema: Run make inside tests/test_basic.
Stop dxr-serve.py, run the build step, and then fire up the server again. If you’re changing Python code that runs only
at request time, you shouldn’t need to do anything; dxr-serve.py should notice and restart itself a few seconds after
you save.
1.6.4 Testing
DXR has a fairly mature automated testing framework, and all server-side patches should come with tests. (Tests for
client-side contributions are welcome as well, but we haven’t got the harness set up yet.)
Writing Tests for DXR
DXR supports two kinds of tests:
1. A lightweight sort with a single file worth of C++ code. This kind stores the C++ source as a Python string
within a subclass of SingleFileTestCase. At test time, it creates a DXR instance on disk in a temp folder,
builds it, and makes assertions about it. If the should_delete_instance class variable is truthy, it then
deletes the instance. If you want to examine the instance manually for troubleshooting, set this to False.
2. A heavier sort which consists of a full DXR instance on disk. test_ignores is an example. Within these
instances are one or more Python files containing subclasses of DxrInstanceTestCase which express the
actual tests. These instances can be built like any other using dxr-build.py, in case you want to do manual
exploration.
Running the Tests
To run all the tests, run this from the root of the DXR repository:
make test
To run just the tests in tests/test_functions.py...
nosetests tests/test_functions.py
To run just the tests from a single class...
nosetests tests/test_functions.py:ReferenceTests
To run a single test...
nosetests tests/test_functions.py:ReferenceTests.test_functions
If you have trouble, make sure you didn’t mistranscribe any colons or periods. Also, if you did not install
libtrilite.so globally, you’ll need to make sure LD_LIBRARY_PATH in your environment points to the
trilite folder.
1.6.5 The Format Version
At the root level of the repo lurks a file called format. Its role is to facilitate the automatic deployment of new
versions of DXR using a script like the included deploy.py. The format file contains an integer which represents
the instance format expected by the DXR code. If a change in the code requires something new in the instance,
generally (1) differently structured HTML or (2) a new DB schema, the format version must be incremented with the
code change. In response, the deployment script will wait until a new instance, of the new format, has been built before
deploying the change.
If you aren’t sure whether to bump the format version, you can always build an instance using the old code, then check
out the new code and try to serve the old instance with it. If it works, you’re probably safe not bumping the version.
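The check a deployment script performs can be reduced to comparing two integers. The sketch below is an assumption-laden illustration: read_format and safe_to_deploy are hypothetical names, and how a built instance records its own format is not specified here, so it is passed in as a plain number.

```python
import os
import tempfile


def read_format(path):
    """Read the integer format version from a file like the repo's format."""
    with open(path) as format_file:
        return int(format_file.read().strip())


def safe_to_deploy(code_format, instance_format):
    """Deploy new code only once an instance of the same format exists."""
    return code_format == instance_format


# Demonstration with a scratch format file containing version 7.
scratch = tempfile.mkdtemp()
format_path = os.path.join(scratch, "format")
with open(format_path, "w") as format_file:
    format_file.write("7\n")

code_format = read_format(format_path)
print(safe_to_deploy(code_format, instance_format=6))  # old instance: wait
print(safe_to_deploy(code_format, instance_format=7))  # new instance: go
```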
1.6.6 Coding Conventions
Follow PEP 8 for Python code, but don’t sweat the line length too much.
1.6.7 Writing Plugins
Note: DXR is in the middle of a plugin system redesign that will move much of DXR’s core functionality to plugins,
eliminate singletons and custom loading tricks, and increase the capabilities of the plugin API. This section documents
the old system.
Structure and API
A plugin is a folder located in the plugins/ folder. A plugin’s name should not contain dashes or other characters
not allowed in Python module names. Notice that the plugin folder will be added to the search path for modules, so
plugin names shouldn’t conflict with other modules. A plugin may import submodules from within its own plugin
folder if it contains an __init__.py file.
A plugin folder must contain these 3 files:
makefile Build steps for this plugin. This must be a GNU makefile with targets build, check, and clean. These
targets build dependencies, verify the build, and clean up after it, respectively. Effects of this makefile should, insofar
as possible, remain within the plugin’s subdirectory. If your makefile does anything, be sure to add a reference
to it in the top-level makefile so it gets called.
indexer.py Routines that generate DB entries to support search
This is a Python module with two functions, pre_process(tree, environ) and
post_process(tree, conn), where the parameters tree and conn are a config for the tree and a
database connection, respectively. The environ parameter is a dictionary of environment variables and may
be modified by the pre_process function prior to the build.
Both functions will be called only once per tree and are allowed to use a number of subprocesses, as specified by tree.config.nb_jobs. If a plugin wants to store information from pre- or post-processing, it can do so in its own temporary directory: each plugin is allowed to use the temporary folder
<tree.temp_folder>/plugins/<plugin-name>. (The temporary folder will remain until htmlification is finished.)
htmlifier.py Routines that emit metadata for building HTML
This is a Python module with two functions: load(tree, conn) and htmlify(path, text). This
module will be used by multiple processes concurrently, but load will be invoked in only one, allowing the
module to load resources into global scope for caching or other purposes.
Once load(tree, conn) has been invoked with the tree config object and database connection, the
htmlify(path, text) function may be invoked multiple times. The path parameter is the path
of the file in the tree; the text parameter is the file content as a string.
The htmlify function returns either None or an object with methods refs(), regions(),
annotations() and links(), which behave as follows:
refs() Yields tuples of (start, end, menu)
regions() Yields tuples of (start, end, class)
annotations() Yields tuples of (line, attributes), where attributes is a dictionary defined
by plugins. It must be sensible to assign the key-value pairs as HTML attributes on a div tag, and class
must contain note note-<type>, where type can be used by templates to differentiate annotations.
links() Yields tuples of (importance, section, items), where items is a generator of tuples of
(icon, title, href). importance is an integer used to sort sidebar sections.
Note that the htmlifier module may not write to the database. It is also strongly recommended that the htmlifier
module not write to the plugin’s temporary folder. It is a strict requirement that the htmlifier module may
be loaded and used by multiple processes at the same time. For this reason, the htmlifier is not allowed to have
worker processes of its own.
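The htmlifier contract can be sketched as follows, using the two-argument htmlify(path, text) signature given in the module description. The TodoHtmlifier class and the todo class name are hypothetical. The sketch also assumes that start and end are character offsets into text and that line numbers start at 1, neither of which is stated above.

```python
import re

# Module-level state filled in by load(); each process has its own copy.
_tree = None
_conn = None

_TODO = re.compile(r"\bTODO\b")


class TodoHtmlifier(object):
    """Hypothetical htmlifier that highlights TODO markers in a file."""

    def __init__(self, path, text):
        self.path = path
        self.text = text

    def refs(self):
        # Would yield (start, end, menu); this sketch emits none.
        return iter(())

    def regions(self):
        # Yields (start, end, class); offsets assumed to index into text.
        for match in _TODO.finditer(self.text):
            yield match.start(), match.end(), "todo"

    def annotations(self):
        # Yields (line, attributes); line numbers assumed to be 1-based.
        for number, line in enumerate(self.text.splitlines(), 1):
            if _TODO.search(line):
                yield number, {"class": "note note-todo",
                               "title": "Unfinished work"}

    def links(self):
        # Would yield (importance, section, items); this sketch emits none.
        return iter(())


def load(tree, conn):
    """Cache the tree config and connection in global scope for later calls."""
    global _tree, _conn
    _tree, _conn = tree, conn


def htmlify(path, text):
    """Return an htmlifier object, or None if this file is of no interest."""
    if not _TODO.search(text):
        return None
    return TodoHtmlifier(path, text)
```

The sketch only reads its inputs: it writes nothing to the database or the temporary folder and starts no worker processes, in line with the restrictions above.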
Crash Early, Crash Often
Since DXR’s indexer generally runs without manual supervision, it’s better to err on the side of crashing than to risk
incorrectness. Any error that could make a plugin emit inaccurate output should be fatal. This keeps DXR’s structural
queries trustworthy.
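As an illustration of this principle, a parsing helper can raise on malformed input instead of silently skipping it. The kind,name,line record format below is hypothetical and chosen only for the example.

```python
def parse_record(line):
    """Parse one 'kind,name,line' record, raising on anything malformed."""
    fields = line.rstrip("\n").split(",")
    if len(fields) != 3:
        # Fail loudly rather than dropping the record and indexing bad data.
        raise ValueError("malformed record: %r" % line)
    kind, name, lineno = fields
    # int() raises ValueError if the line number is not numeric.
    return kind, name, int(lineno)
```

Wrapping the call in a try/except that logs and continues would keep the indexer running, but at the cost of results that may be quietly incomplete.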
Configuration
Configuration keys prefixed with plugin_ in either a tree section or the DXR section of the configuration will be
read and stored on the tree and config objects, respectively. Please note that these values will not have any default
values, nor will they be present unless defined in the config file.
It’s the plugins’ responsibility to validate these values. Plugins should prefix all config keys as
plugin_<plugin-name>_<key>. It’s also recommended that plugins document their keys in the plugin section of Configuration.
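Since these values have no defaults and may be absent, a plugin might wrap access in a small helper. The sketch below assumes the values are exposed as attributes named after the full config key and hold raw strings; that storage detail, the helper name get_plugin_option, and the max_size key are assumptions for illustration.

```python
def get_plugin_option(obj, plugin_name, key, default=None, convert=str):
    """Read plugin_<plugin-name>_<key> from a tree or config object.

    Returns default when the key was not defined in the config file and
    raises ValueError when the value cannot be converted.
    """
    attr = "plugin_%s_%s" % (plugin_name, key)
    raw = getattr(obj, attr, None)  # assumed attribute-style storage
    if raw is None:
        return default
    try:
        return convert(raw)
    except ValueError:
        raise ValueError("invalid value for %s: %r" % (attr, raw))
```

A plugin could then call, for example, get_plugin_option(tree, "example", "max_size", default=1024, convert=int) and let an invalid setting abort the run.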
1.6.8 Troubleshooting
Why is my copy of DXR acting erratic, failing at searches, making requests for JS templates that shouldn’t exist, and just genera
Did you run python setup.py install for DXR at some point? Never, ever do that in development;
use python setup.py develop instead. Otherwise, you will end up with various files copied into your
virtualenv, and your edits to the originals will have no effect.
CHAPTER 2
Back Matter
2.1 Glossary
filtered term A query term consisting of an explicit filter name and an argument, like regexp:hi|hello or
callers:frob
index A folder containing one or more source trees indexed for search and prepared to serve as HTML. Indices are
created by the dxr-index.py command.
instance See index.
term A space-delimited part of a query
text term A query term without an explicit filter name, interpreted as raw text for a substring search
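The distinction between filtered terms and text terms can be illustrated with a simplified classifier. This is not DXR's query parser: it splits on whitespace only, ignores quoting, and treats whatever precedes the first colon as a filter name without checking it against the filters DXR actually supports.

```python
def classify_terms(query):
    """Split a query on whitespace and label each term.

    Returns a list of (kind, filter_name, argument) tuples, where kind is
    "filtered" or "text" and filter_name is None for text terms.
    """
    result = []
    for term in query.split():
        name, sep, argument = term.partition(":")
        if sep and name and argument:
            result.append(("filtered", name, argument))
        else:
            result.append(("text", None, term))
    return result
```

With this sketch, the query regexp:hi|hello callers:frob main yields two filtered terms followed by one text term.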
2.2 Icon Credits
DXR uses third-party icons from a variety of sources.
If you add an icon, please document its origin in this document. Feel free to use existing icons, but keep in mind that
they use semantic naming. So don’t use the search icon for zoom, as we may later change the search icon from a
magnifying glass to, for example, binoculars.
2.2.1 From Silk
The following icons originate from Silk by Mark James, licensed under the Creative Commons Attribution 2.5 License.
• folder
• path_search
• exclude_path
• goto_folder
• page_white_find
• page_white_code
• page_white
• page_white_wrench
• buglink
• external_link
• mimetypes/php
• mimetypes/c
• mimetypes/build
• mimetypes/sh
• mimetypes/cs
• mimetypes/h
• mimetypes/css
• mimetypes/js
• mimetypes/rb
• mimetypes/txt
• mimetypes/cpp
• mimetypes/xml
• mimetypes/unknown
• mimetypes/ui
• mimetypes/conf
• mimetypes/java
• mimetypes/svg
• mimetypes/html
• mimetypes/iso
• mimetypes/vs
• mimetypes/py (mixed with official python logo)
• mimetypes/mm (Remixed by DXR developers)
2.2.2 From FatCow Hosting
The following icons originate from FatCow by FatCow Hosting, licensed under the Creative Commons Attribution 3.0 License.
2.2.3 From Fugue
The following icons originate from Fugue by Yusuke Kamiyamane, licensed under the Creative Commons Attribution 3.0
License.
• raw
• warning
• log
• blame
• diff
• search_warning
• regexp-search
• mimetypes/diff
• mimetypes/tex
2.2.4 From SharpDevelop
The following icons originate from SharpDevelop, a mix of (partially) derivative works of Yusuke Kamiyamane modified
by the SharpDevelop team and independent works by the SharpDevelop team, all licensed under the GNU LGPL.
• jump
• method
• reference
• type
• field
• macro
• members
• struct
• union
• class
• enum
2.2.5 From Tango Project
The following icons originate from Tango by the Tango Desktop Project, released into the public domain.
• search