Arches is an open-source data management platform originally developed for the cultural heritage field by the Getty Conservation Institute and World Monuments Fund. Due to the complex and varied nature of cultural heritage data, and to promote interoperability and sustainable data practices, the Arches Platform has been developed as a standards-based, comprehensive and flexible platform that supports a wide array of uses.
Welcome to the Arches official documentation site!
This documentation primarily aims to provide guidance with Arches installation, technical administration, management, localization, customization and other extensions. Because Arches sees continual improvement, please help ensure that this documentation provides clear, accurate, and up-to-date information by filing tickets on GitHub that identify issues for improvement.
The documentation is organized into the following sections. It is recommended to start with the Getting Started and Installation section if you are new to Arches.
Arches is an enterprise-level system developed to improve data management in support of effective heritage conservation and management. Because Arches has grown in power and flexibility it serves a wide range of needs in the cultural heritage sector and beyond. This section provides an overview of the Arches software release process and different versions of Arches.
Arches is a web-based, geospatial information system for cultural heritage inventory and management. The platform is purpose-built for the international cultural heritage field, and it is designed to record all types of immovable heritage, including archaeological sites, buildings and other historic structures, landscapes, and heritage ensembles or districts.
Arches allows administrators to create their own database schema, and manage their own thesauri, while end users can search, explore and download the resources directly. In this way Arches is not only a robust and easy to use inventory system, it is also a perfect way to publish and disseminate your organization’s cultural heritage information.
Arches is a web framework built on Django and is designed to make it easier to build applications that need:
Geospatial data management and geoprocessing like a GIS (Geographic Information System) offers, but with a much more flexible approach for modeling the geometries associated with a resource.
the ability to import arbitrary data schema in the form of graphs as a means of defining the set of attributes that describe data resources
an Ontology as a means of formally naming and defining data types, properties, and the relationships between the data entities that describe a resource.
Thesauri to manage the controlled vocabularies needed to describe and index information in a consistent and uniform way.
Arches manages data “resources”. Resources can represent almost anything you want: physical things (such as a cultural heritage object), temporal things (such as activities or events), actors (such as a person or organization), or conceptual objects (such as an image, document, or other information carrier).
Resources are defined as directed graphs (nodes connected by edges). Nodes in the graph are used to represent the attributes (or collection of attributes) of a resource and edges define the type of relationship between attributes. In practice, a resource graph in Arches functions much like a schema does in a relational database.
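As a rough illustration of the idea above, a resource graph can be pictured as nodes joined by typed, directed edges. The sketch below is plain Python and does not use Arches' own models; the class name, node names, and relationship labels are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceGraph:
    """A toy directed graph: nodes are attributes, edges are typed relationships."""

    name: str
    nodes: set = field(default_factory=set)
    # Each edge is a (source node, relationship type, target node) tuple.
    edges: list = field(default_factory=list)

    def add_edge(self, source, relationship, target):
        """Register both nodes and record the directed edge between them."""
        self.nodes.update((source, target))
        self.edges.append((source, relationship, target))

    def children(self, node):
        """Return the nodes reachable from `node` by one outgoing edge."""
        return [target for source, _, target in self.edges if source == node]


# Hypothetical "Building" graph; names and relationships are illustrative only.
graph = ResourceGraph("Building")
graph.add_edge("Building", "has_name", "Name")
graph.add_edge("Name", "has_type", "NameType")
graph.add_edge("Building", "has_location", "Geometry")

print(graph.children("Building"))  # ['Name', 'Geometry']
```

Read this way, the graph plays the role of a schema: it says which attributes exist and how they relate, while the actual values live in instances of the graph.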
Arches provides core services for creating, reading, updating, and deleting resources. Because resources are defined as graphs, Arches provides the services needed to import and parse resource graphs, as well as the ability to create and interact with instance graphs (e.g. an instance of a resource graph).
To promote consistent data creation, update, and indexing workflows, Arches implements a Reference Data Manager (RDM) that can manage thesauri. The RDM allows users with the appropriate privileges to update thesaurus entries in a manner compliant with SKOS (http://www.w3.org/2004/02/skos/) and assign the concepts within a thesaurus with data entry forms.
The Arches project uses semantic versioning to describe unique states of the software. Arches was initially released in October 2013 as version 1.0. Since then, Arches has had 7 major releases and many more minor releases and patch releases (see Arches Release Process). For more details about the capabilities introduced in past versions and capabilities planned for future versions, please review: https://www.archesproject.org/roadmap/
Arches is primarily intended for software developers who need to build flexible web applications and wish to hide the complexities of ontologies, thesauri, and geospatial data management from their users.
This is the official documentation for Arches. It should provide you with background information on Arches, how to install it, and a good overview of its capabilities. While you are using Arches, be aware that much of the content here is also available by clicking the “?” symbol in the top-right corner of any page.
Improve Our Documentation! If you find errors, have suggestions, or want to make a contribution, these docs are managed in the archesproject/arches-docs repo.
Arches is open source software, which means that with your help it will continue to evolve and improve.
Bug Reports and Code Contribution: If you find issues with the Arches interface or code, or have the means to contribute code to fix existing issues, please begin by reading our guidelines for Contributing to Arches.
Translations: We are always hoping to bring Arches to new audiences around the world. Please post on the Arches Forum if you are interested in contributing a translation.
Feature releases will introduce significant, new features to Arches and will be announced approximately every 6 months.
Feature releases may contain schema or API changes that may not be compatible with the previous feature release.
Each feature release will be incremented with the pattern a.b, where a represents the major release and b represents the feature (aka minor) release.
Each feature release will be placed in its own branch in git, named with its release number followed by an x representing the latest patch release. (e.g. stable/a.b.x).
Following each feature release we will resolve bugs, performance, and security issues in the most recent feature release with patch releases.
A new patch release, if needed, will be announced every 1 to 3 months and will not include breaking changes with the previous patch release. Therefore, we encourage users to stay up-to-date with these releases.
Patch releases will be incremented as such: a.a.b, a.a.c… with a representing the feature release and b and c representing patch (aka micro) releases. In Git each patch release will be identified in its feature release branch with a tag.
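The numbering scheme described above can be sketched in a few lines of Python. This is only an illustration, not part of Arches: the function names are hypothetical, and the branch pattern is taken from the stable/a.b.x example given earlier.

```python
import re

# Matches versions written as major.feature.patch, e.g. "7.4.2".
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(version):
    """Split a 'major.feature.patch' string into a tuple of three integers."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a major.feature.patch version: {version!r}")
    return tuple(int(part) for part in match.groups())


def feature_branch(version):
    """Name the branch that holds every patch release of a feature release."""
    major, feature, _patch = parse_version(version)
    return f"stable/{major}.{feature}.x"


print(feature_branch("7.4.2"))  # stable/7.4.x
# Comparing tuples orders releases numerically, so 7.4.10 is newer than 7.4.2.
print(parse_version("7.4.2") < parse_version("7.4.10"))  # True
```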
This section of the documentation provides guidance on how to install Arches. As an “enterprise-level” system, Arches is designed for deployment in organizational contexts with both needs and capabilities beyond those typical of an individual person. While Arches can be installed and tested on a personal computer, it is designed for deployment on servers in a networked environment.
Arches can be deployed for testing on a personal computer provided one has administrative permissions, some comfort and familiarity with command line interfaces, and (typically) some patience with troubleshooting. “Production” deployments (either public or private to an organizational setting) require some experience with Web hosting, IT systems administration and, if using a cloud service provider, cloud computing infrastructure. These skills are needed to install, configure, and (critically) maintain Arches.
Arches works on Linux, Windows, or macOS. Most production implementations use Linux servers.
To begin development or make a test installation of Arches, you will need the following minimum resources:
Disk Space:
2GB for all dependencies and Arches.
8GB to store uploaded files, database backups, etc.
Depending on how many uploaded files (images, 3D models, etc.) you will have, you may need much more disk space. We advise an early evaluation of how much space you think you’ll need, and then provisioning twice as much just to be safe.
Memory (RAM):
4GB
This recommendation is based on the fact that ElasticSearch requires 2GB to run, and as per official ElasticSearch documentation no more than half of your system’s memory should be dedicated to ElasticSearch.
In production, you very likely need to increase your memory. In building the production (minified) frontend asset bundle, yarn (all by itself!) will require at least 8GB to run. If you don’t have enough memory, yarn will likely return an error, sometimes after several minutes or hours of processing. In production, you may also find it useful to allow ElasticSearch to use up to 32GB.
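The arithmetic behind these memory figures can be written out as a small sketch. The function names are hypothetical; the 2GB requirement, the "no more than half" rule, and the 32GB ceiling are the figures quoted above.

```python
def elasticsearch_heap_gb(system_ram_gb, cap_gb=32):
    """Suggest an ElasticSearch memory allowance: at most half of RAM, capped."""
    return min(system_ram_gb / 2, cap_gb)


def meets_minimum(system_ram_gb, elasticsearch_needs_gb=2):
    """True if ElasticSearch's requirement fits within half of system RAM."""
    return elasticsearch_heap_gb(system_ram_gb) >= elasticsearch_needs_gb


print(elasticsearch_heap_gb(4))    # 2.0
print(meets_minimum(4))            # True
print(elasticsearch_heap_gb(128))  # 32
```

With 4GB of RAM, half is exactly the 2GB ElasticSearch needs, which is why 4GB is the stated minimum.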
Windows: Use the EnterpriseDB installers, and use Stack Builder (included) to get PostGIS. After installation, add the following to your system’s PATH environment variable: C:\Program Files\PostgreSQL\12\bin. Make sure you write down the password that you assign to the postgres user.
Elasticsearch is integral to Arches and can be installed and configured many ways.
For more information, see Arches and Elasticsearch.
GDAL >= 2.2.x:
Windows: Use the OSGeo4W installer, and choose to install the GDAL package (you don’t need QGIS or GRASS). After installation, add C:\OSGeo4W64\bin to your system’s PATH environment variable.
Node.js 16.x (recommended):
Installation: https://nodejs.org/ (choose the installer appropriate to your operating system).
NOTE: Arches may not be compatible with later versions of Node.js (after 16) (see discussion).
NOTE: We are pointing to the “classic” yarn installer to avoid installation of more recent versions of yarn that are not compatible with Arches via the Node.js package manager.
To support long-running task management, like large user downloads, you must install a Celery broker like RabbitMQ or Redis:
For Ubuntu we maintain an ubuntu_setup.sh script to install dependencies. It works for 18.04 and 20.04, and preliminary testing shows it to be compatible with 22.04 as well.
You will be prompted before each dependency is installed, or use yes | source ./ubuntu_setup.sh to install all components (Postgres/PostGIS, Node/Yarn, and ElasticSearch).
We have an in-progress Docker install, and would love help improving it. You can also review some works-in-progress and community-created approaches to using Docker in Installation with Docker.
You need a Python 3.10+ virtual environment. Skip ahead if you have already created and activated one. Otherwise, use the commands below for a quick start.
Create a virtual environment:
python3 -m venv ENV
This will generate a new directory called ENV.
Note
On some Linux distributions, if the Python version is less than 3.10, entering the command above may yield an error, but the error should alert you to any dependencies you need to install, after which you will be able to run the command.
Activate the virtual environment
The following are relative paths to an activate script within ENV.
Linux and macOS:
source ENV/bin/activate
Windows:
ENV\Scripts\activate.bat
Note
After you activate your virtual environment, your command prompt will be prefixed with (ENV). From here on the documentation will assume you have your virtual environment activated. Run deactivate if you need to deactivate the virtual environment.
Test the Python version in ENV:
python
This will run the Python interpreter and tell you what version is in use. If you don’t see at least 3.10, check your original Python installation, delete the entire ENV directory, and create a new virtual environment. Use exit() or Ctrl+D (Ctrl+Z followed by Enter on Windows) to leave the interpreter.
Upgrade pip
A recommended step, though not always strictly necessary:
A Project holds branding and customizations that make one installation of Arches different from the next. The name of your project must be lowercase and use underscores instead of spaces or hyphens. The example below uses my_project.
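Before creating a project, the naming rule above can be checked with a few lines of Python. This is a sketch, not an Arches utility: both function names are hypothetical, and the isidentifier() check is an added assumption (that the name must also be importable as a Python module).

```python
def is_valid_project_name(name):
    """Check the documented rule: lowercase, with no spaces or hyphens.

    The extra isidentifier() check is an assumption, not something the text
    states: it rejects names Python could not import as a module.
    """
    has_no_separators = " " not in name and "-" not in name
    return name == name.lower() and has_no_separators and name.isidentifier()


def suggest_project_name(name):
    """Lowercase a free-form title and replace spaces and hyphens with underscores."""
    return name.strip().lower().replace(" ", "_").replace("-", "_")


print(is_valid_project_name("my_project"))  # True
print(is_valid_project_name("My-Project"))  # False
print(suggest_project_name("My Project"))   # my_project
```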
You may be prompted to enter a password for the postgres user. Generally, our installation scripts set this password to postgis, however you may have set a different password during your own Postgres/PostGIS installation.
In your current terminal, run the Django development server (with the Arches virtual environment activated):
python manage.py runserver
Then, in a second terminal, activate the virtual environment used by Arches (this is a required step). Navigate to the root directory of the project (you should be on the same level as package.json) and build a frontend asset bundle:
yarn build_development creates a static frontend asset bundle. Any changes made to frontend files (e.g. .js) will not be viewable until the asset bundle is rebuilt. Run yarn build_development again to update the asset bundle, or run yarn start to run an asset bundler server that will detect changes to frontend files and rebuild the bundle appropriately.
The first thing everyone wants to do is look at the map, so let’s set this up first.
Go to Mapbox.com and create a free account.
Find your default API key (starts with pk.) and copy it.
Now go to localhost:8000/settings.
Log in with the default credentials: username: admin, password: admin
Find the Default Map Settings, and enter your Mapbox API Key there.
Feel free to use the ? in the top-right corner of the page to learn about all of the other settings, and change any that you like (heed warning below).
Save the settings.
Navigate to localhost:8000/search to make sure the basemap appears.
Note
We recommend exporting these settings by running python manage.py packages -o save_system_settings.
This will create a JSON file in your project, which will be used if you ever need
to set up your database again.
Warning
If you create a new Project Extent, you should also update the Search Results Grid settings,
otherwise you could get a JSON error in the search page. To be on the safe side, choose
a high Hexagon Size combined with a low Hexagon Grid Precision.
An Arches “package” is an external container for database definitions (graphs, concept schemes),
custom extensions (including functions, widgets, datatypes) and even data (resources).
Packages are installed into projects, and can be used to share schema between installations.
Getting a connection error like this (in the dev server output or in the browser)
Error
ConnectionError: ConnectionError(<urllib3.connection.HTTPConnection object at 0x0000000005C6BC50>: Failed to establish a new connection: [Errno 10061] No connection could be made because the target machine actively refused it) caused by: NewConnectionError(<urllib3.connection.HTTPConnection object at 0x0000000005C6BC50>: Failed to establish a new connection: [Errno 10061] No connection could be made because the target machine actively refused it)
means Arches is not able to communicate with ElasticSearch. Most likely, ElasticSearch is just not running, so start it up and reload the page. If you can confirm that it is running, make sure Arches is pointed to the correct port.
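One quick way to tell "not running" apart from "wrong port" is to test whether anything accepts TCP connections on the expected port. The sketch below uses only the Python standard library; the helper name is hypothetical, and localhost:9200 is an assumption based on Elasticsearch's default HTTP port, so adjust it to match your own settings.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    # Assumed host and port; change these to match your Elasticsearch settings.
    if is_port_open("localhost", 9200):
        print("Something is listening on localhost:9200")
    else:
        print("Nothing is listening on localhost:9200; is Elasticsearch running?")
```

A successful connection only shows that some service is listening there, not that it is Elasticsearch, so treat it as a first check rather than proof.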
Postgres password authentication error
Error
django.db.utils.OperationalError: FATAL: password authentication failed for user "postgres"
Most likely you have not correctly set the database credentials in your settings.py file. Many of our install scripts set the db user to postgres and password to postgis, so that’s what Arches looks for by default. However, if you have changed these values (particularly if you are on Windows and had to enter a password during the Postgres/PostGIS installation process), the new values must be reflected in settings.py or settings_local.py.
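For orientation, database credentials in a Django settings file live in the DATABASES dictionary. The fragment below is a sketch using standard Django keys, not a copy of an Arches settings file: the database name, engine, host, and port are assumptions, and the settings.py generated for your project may define additional keys. It is usually safer to edit the USER and PASSWORD values already present than to replace the whole dictionary.

```python
# Sketch of the database block in settings.py or settings_local.py.
# All values are placeholders: "my_project" is the example project name,
# and the engine, host, and port shown here are assumptions.
DATABASES = {
    "default": {
        "ENGINE": "django.contrib.gis.db.backends.postgis",
        "NAME": "my_project",
        "USER": "postgres",      # default user set by many install scripts
        "PASSWORD": "postgis",   # replace with the password you chose
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```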
Note
On Windows, you can avoid having to repeatedly enter the password while running commands in the console by setting the PGPASSWORD environment variable: set PGPASSWORD=<yourpassword>.
Building the frontend assets can sometimes be a source of challenge and frustration. Sometimes a “locked down” computer (with strict security configurations) may cause some trouble. If this is the case, you can try the following steps to iterate toward a successful build.
Edit your .yarnrc file to disable strict SSL.
To do so, navigate to your project’s root directory and open the .yarnrc file in a text editor. Add the following lines to the end of the file:
cafile null
strict-ssl false
After the above edits, save the file.
Remove the node_modules folder and yarn.lock file if they exist:
If you’re using a virtual environment, activate it. ENV should be replaced with the name of your virtual environment.
source ENV/bin/activate
Run your Arches Django server and leave it running.
python manage.py runserver
Open a *new terminal* to complete the following steps below.
If you’re using a virtual environment, activate it as in step 4 above. ENV should be replaced with the name of your virtual environment.
source ENV/bin/activate
Navigate to the same directory as package.json, and install the frontend dependencies:
cd path/to/dir/my_project/my_project
yarn install
Once the dependencies are installed, build your static asset bundle:
yarn build_development
If successful, you should see a message indicating that the build completed.
Docker is a platform that allows you to package, distribute, and run applications in containers. By using Docker, you can install Arches on any system that supports Docker. This helps insulate you from worries about how the operating system or other specifics of your host system impact dependencies and configurations. Some of the benefits of using Docker include:
Isolation:
Docker allows developers to isolate applications from the underlying infrastructure,
reducing the risk of conflicts and making it easier to manage dependencies.
Scalability:
Docker makes it easy to scale applications up or down, as containers can be started
and stopped quickly and easily.
Portability:
Containers allow a developer to package up an application with all of the parts
it needs, such as libraries and other dependencies, and ship it all out as one package. By doing so, the developer can be assured that the application will run on any other Linux machine regardless of any customized settings that machine might have that could differ from the machine used for writing and testing the code.
This document will guide you through the process of using Docker to install Arches on your system. Even if you do not want to use Docker to deploy Arches, the review of Arches Docker setups can still provide a useful guide to see how various dependencies and configurations fit together.
Important
Arches currently lacks an “official” approach to installation using Docker. The examples that we discuss here are drawn from various works-in-progress and community-created approaches. They should be helpful to get started with Docker and Arches, but they are not fully tested for production deployments.
Docker Compose is a tool for defining and running multi-container Docker applications. It must also be installed on your system. You can download Docker Compose from the official website: https://docs.docker.com/compose/install/
The following lists Docker repositories that set up Docker containers running Arches configured for testing and development, not a production deployment.
Arches Dependencies
This repo (archesproject/arches-dependency-containers) uses Docker to provision the dependency PostgreSQL, ElasticSearch, and RabbitMQ services required by Arches. NOTE: This repo does not install Arches itself, it just provides an alternate means to install dependency services.
Arches for Science
This repo (archesproject/arches-for-science-prj) uses Docker to deploy an instance of Arches running the package and extensions for the Arches for Science project. This Docker deployment is designed to run in conjunction with the Docker containers and Docker network started by the Arches Dependency repo discussed above. The Arches for Science project aims to support workflows in the scientific conservation of objects (especially in museum collections), see: https://www.archesproject.org/arches-for-science/
Arches HER (Historic England)
This repo (archesproject/arches-her) uses Docker to deploy an instance of Arches running the package and extensions for the Arches implementation developed for Historic England (https://www.archesproject.org/arches-for-hers/). This Docker deployment is designed to run in conjunction with the Docker containers and Docker network started by the Arches Dependency repo discussed above.
The Arches for Science Docker repository and the Arches HER Docker repository both launch instances of Arches that depend upon Docker containers and the Docker network started by the Arches Dependencies repository. These Docker repositories can be used to “spin up” different versions of Arches and Arches dependencies. The dependencies you launch with Arches Dependencies must match the version of Arches you are starting in Arches for Science or Arches HER. To switch between versions of Arches in Arches for Science and Arches HER, use git to check out a branch with the desired version of Arches. For example, to run Arches for Science using Arches 7.4, use the dev/7.4.x branch in the repo: git checkout dev/7.4.x
The following lists Docker repositories that set up Docker containers running Arches configured for production deployment. These repositories are not officially part of the Arches Project, and may not have received the same level of review and vetting as Arches Project repositories. While these are not yet fully vetted, they can be a useful starting point or guide to use Docker for production deployment of Arches:
arches-via-docker
This repo (opencontext/arches-via-docker) uses Docker to provision containers running Arches (the most current stable version) and containers for the dependency PostgreSQL, ElasticSearch, and Redis services. It also starts an Nginx (as a proxy server) container as well as other containers to obtain and update SSL (for secure HTTPS) encryption certificates. You can use this repo directly or use it as a guide to see how Arches can be configured for “production” deployments.
Arches Projects facilitate all of the customizations that you will need to make one installation of Arches different from the next. You can update HTML or CSS to modify web page branding, and add functions, datatypes, and widgets to introduce new functionality. A project sits outside of your virtual environment, and can thus be transferred to any other system where Arches is installed.
Many project-specific settings are defined here. You should use settings_local.py to store variables that you may want to keep out of the public eye (db passwords, API keys, etc.).
These directories will store the custom extensions that you can create for the project. Developers interested in pursuing these customizations should start with Creating Extensions.
A package is an external collection of Arches data (resource models, business data, concepts, collections) and customization files (widgets, datatypes, functions, system settings) that you can load into an Arches project.
-db
true to run setup_db to rebuild your database. default = ‘false’
-ow
overwrite to overwrite concepts and collections. default = ‘ignore’
-st
stage to stage concepts and collections. default = ‘stage’
-s
a path to a zipfile located on github or locally
-o
operation name
-y
accept defaults (will overwrite existing branches and system settings with those in the package)
-bulk
uses bulk_save methods which run faster but don’t call an object’s regular save method
-dev
loads three test users
If you do not pass the -db True to the load_package command, your database will not be recreated. If you already have resource models and branches with the same id as those you are importing, you will be prompted to confirm whether you would like to keep or overwrite each model or branch.
If you pass the -bulk argument, know that any resource instances that rely on functions to dynamically create/edit tiles will not be called during package load. Additionally, some logging statements may not print to console during import of reference data. Whereas the default save methods create an edit in the edit history for each individual tile created, -bulk will instead create a single edit for all tiles, of type: “bulk_create”. Resource creation will still be individually saved to edit history.
Note
It is important to note that you cannot load a package directly into core Arches. Packages must be loaded into a project.
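To show how the options above combine, the following sketch assembles a load_package command line as a list of arguments. It does not run anything. The helper name and the package path are hypothetical, and the overall command shape (python manage.py packages -o <operation>) is inferred from the other packages commands in this documentation rather than confirmed against the Arches source.

```python
import shlex


def load_package_command(source, rebuild_db=False, accept_defaults=False):
    """Assemble a load_package invocation from the options listed above."""
    args = ["python", "manage.py", "packages", "-o", "load_package", "-s", source]
    if rebuild_db:
        # -db true asks load_package to run setup_db and rebuild the database.
        args += ["-db", "true"]
    if accept_defaults:
        # -y accepts defaults, overwriting existing branches and system settings.
        args.append("-y")
    return args


# Hypothetical path to a local package zip file.
command = load_package_command("path/to/my_package.zip", rebuild_db=True)
print(shlex.join(command))
# python manage.py packages -o load_package -s path/to/my_package.zip -db true
```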
If you are a developer running the latest arches you probably want to create a project with a new Arches installation. This ensures that the arches-project create command uses the latest project templates.
Uninstall arches from your virtualenv
pip uninstall arches
Navigate into the arches root folder and delete the build directory
Reinstall arches
python setup.py install
python setup.py develop
Navigate to where you want to create your new project and run:
arches-project create mynewproject
Note
You can use the option [{-d|--directory} <directory_name>] to change the directory your new project will be created in.
Finally run the load_package command using the project’s manage.py file.
If you want to create additional projects with the same data or share your data with others that need to create similar projects, you probably want to create a package.
The create_package command will help you get started by generating the folder structure of a new package and loading the resource models of your current project into your new package.
To create a new package, simply run the create_package command. The following example would create a package called mypackage.
full path to the package directory you would like to create
-o
operation name
Below is a list of directories created by the create_package command and a brief description of what belongs in each. Be sure not to place files that you do not want loaded into these directories. If, for example, you have draft business_data that is not ready for loading, just add a new directory and stage your files there. Directories other than what is listed below will be ignored by the loader.
business_data
Resource instance .csv and corresponding .mapping files, each sharing the same base name.
business_data/files
Files to be added to the uploaded files directory
business_data/relations
Resource relationship files (.relations)
business_data/resource_views
sql views of flattened resource models
extensions/function
Each function in this directory should have its own directory with a template (.htm), viewmodel (.js) and module (.py). Each file must share the same base name.
extensions/datatypes
Each datatype in this directory should have its own directory with a template (.htm), viewmodel (.js) and module (.py). Each file must share the same base name.
extensions/widgets
Each widget in this directory should have its own folder with a template (.htm), viewmodel (.js) and configuration file (.json). Each file must share the same base name.
graphs/branches
arches.json files representing branches
graphs/resource_models
arches.json files representing resource models
map_layers/mapbox_styles/overlays*
Each overlay should have a directory with a mapbox style as exported from mapbox including a style.json file, license.txt file and an icons directory
map_layers/mapbox_styles/basemaps*
Each basemap should have a directory with a mapbox style as exported from mapbox including a style.json file, license.txt file and an icons directory
map_layers/tile_server/overlays*
Each overlay should have a directory with a .vrt file and .xml to style and configure the layer. Each file must share the same base name.
map_layers/tile_server/basemaps*
Each basemap should have a directory with a .vrt file and .xml to style and configure the layer. Each file must share the same base name.
preliminary_sql
sql files containing database operations necessary for your project.
reference_data/concepts
SKOS concepts .xml files
reference_data/collections
SKOS collection .xml files
system_settings
The system settings file for your project
* map layer configuration
By default mapbox-style layers will be loaded with the name property found in the layer’s style.json file. The default name for tile server layers will be the basename of the layer’s xml file. For both mapbox-style and tile server layers the default icon-class will be fa fa-globe. To customize the name and icon-class, simply add a meta.json file to the layer’s directory with the following object:
{"name":"example name","icon":"fa example-class"}
It is not necessary to populate every directory with data. Only add those files that you would like to share.
Once you’ve added the necessary files to your package, simply compress it as a zip file or push it to a github repository and it’s ready to be loaded.
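Because directories outside the documented list are ignored by the loader, it can be useful to see which top-level directories in a package fall into that category before zipping it. The sketch below checks top-level names only; the function name and the "drafts" staging directory are hypothetical.

```python
import tempfile
from pathlib import Path

# Top-level directories named in the list above.
RECOGNIZED_DIRECTORIES = {
    "business_data",
    "extensions",
    "graphs",
    "map_layers",
    "preliminary_sql",
    "reference_data",
    "system_settings",
}


def unrecognized_directories(package_dir):
    """Return top-level directories that are not in the documented list."""
    return sorted(
        entry.name
        for entry in Path(package_dir).iterdir()
        if entry.is_dir() and entry.name not in RECOGNIZED_DIRECTORIES
    )


# Demonstrate with a throwaway package containing one staging directory.
with tempfile.TemporaryDirectory() as package_dir:
    for name in ("business_data", "graphs", "drafts"):
        (Path(package_dir) / name).mkdir()
    print(unrecognized_directories(package_dir))  # ['drafts']
```

Anything the function reports is either an intentional staging area or a misspelled directory whose contents would silently be skipped.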
Two different files are used to define custom settings for your package.
package_settings.py
The django settings relevant to your project not managed in system settings. For example, you may want to include your time wheel configuration and your analysis SRID settings in this file so that users do not have to add these settings manually to their own settings file after loading your package. This file is copied into your project when the package is loaded.
package_config.json
This file allows you to configure other parts of the data loading process. For example, the order in which the business data files are loaded. Contents of this file may look like
If you make changes to the resource models in your project you may want to update your package with those changes. You can do that with the update_package command:
full path to the package directory you would like to update
-o
operation name
-y
accept defaults (will overwrite existing resource models with those from your project)
Bear in mind that this command will not update a package directly on Github. It will however update a package in a local directory that you have cloned from an existing package on Github or created yourself with the create_package command.
This section provides information on how to configure and administer Arches, as well as a brief discussion on how to create, edit, delete and search resources in Arches.
This section guides you through completing some additional configurations of Arches after you have successfully installed and launched (started) a running instance of Arches.
You may find that new resources are named Undefined in your search results. This is because the Resource Descriptor Function has not yet been configured for your Resource Model. Follow these steps to configure it separately for each Resource Model in your database.
Go to Arches Designer > Resource Models (/graph)
In the list of Resource Models, follow Manage∙∙∙ > Manage Functions
Select the Define Resource Descriptors function to add it to the Resource Model
Use the tabs to configure all three different descriptor templates.
To configure a descriptor, you must first choose what card in the Resource Model holds the data you want to display. Choose this card in the dropdown, and variables corresponding to each node in that card will be added to the template, demarcated with <>. Now you can rearrange these variables, delete some of them, and/or add text to customize the descriptor.
Example: Consider a Resource with a Name node value of Folsom School and NameType node value of Primary.
Template | Result
<Name>, <NameType> | Folsom School, Primary
Building Name: <Name> | Building Name: Folsom School
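The substitution in this example can be mimicked with a short Python sketch. It only illustrates how the <> variables behave and is not the code Arches uses; the function name is hypothetical, and leaving unknown placeholders untouched is a design choice made for this sketch.

```python
import re


def render_descriptor(template, node_values):
    """Replace each <NodeName> placeholder with that node's value.

    Placeholders with no matching value are left unchanged.
    """
    def substitute(match):
        return str(node_values.get(match.group(1), match.group(0)))

    return re.sub(r"<([^<>]+)>", substitute, template)


values = {"Name": "Folsom School", "NameType": "Primary"}
print(render_descriptor("<Name>, <NameType>", values))     # Folsom School, Primary
print(render_descriptor("Building Name: <Name>", values))  # Building Name: Folsom School
```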
Important
After you define your descriptors, you must Re-Index to update all of the existing resources in your database. This could take a while, if you have a lot of resources (that’s why it’s best to do this step right away!).
If there are multiple instances of a given card in a Resource, the first one added will be used to create these descriptors. To manually change this, edit the Resource in question and drag the desired tile to the top of the list.
Warning
Any user with read access permission to a resource will see these resource descriptors wherever the resource shows up in search results or on the map. If a card is intended to be hidden from any group of users, it should not be used in this function.
This section covers configuration settings that are managed through a browser. You will likely need to update other settings that are defined elsewhere.
Arches uses the Mapbox mapping library for map display and data creation. Arches also supports Mapbox basemaps and other services.
Mapbox API Key (Optional) - By default, Arches uses some basemap web services from Mapbox. You will need to create a free API key (or “access token”) for these services to be activated. Alternatively, you could remove all of the default basemaps and add your own, non-Mapbox layers.
Mapbox Sprites - Path to Mapbox sprites (use default).
Mapbox Glyphs - Path to Mapbox glyphs (use default).
Project Extent
Draw a polygon representing your project’s extent. These bounds will serve as the default for the cache seed bounds, search result grid bounds, and map bounds in search, cards, and reports.
Map Zoom
You can define the zoom behavior of your maps by specifying max/min and default values. Zoom level 0 shows the whole world (and is the minimum zoom level). Most map services support a maximum of 20 or so zoom levels.
Search Results Grid
Arches aggregates search results and displays them as hexagons. You will need to set default parameters for the hexagon size and precision. Aggregating search results into a hexagonal grid can greatly improve performance of the map user-interface because fewer geometric features need to be delivered to a client’s browser.
Arches system settings for map search results grid
The Arches Elasticsearch component indexes a geohash of location data that powers efficient aggregation of geographic locations for resource instances. However, to enable map display of search results aggregated in a hexagonal grid, you first need to add a map layer that includes "source":"search-results-hex" in the layer definitions. You can read more about adding map layers (see Creating New Map Layers) or you can use SQL to insert a hex grid map layer as below:
A large project area combined with a small hexagon size and/or high precision will take a very long time to load, and can crash your browser. We suggest changing these settings in small increments to find the best combination for your project.
Application Name - Name of your Arches app, to be displayed in the browser title bar and elsewhere.
Default Data Import/Export Name - Name to associate with data that is imported into the system.
Web Analytics
If you have made a Google Analytics Key to track your app’s traffic, enter it here.
Thesaurus Service Providers
Advanced users may create more SPARQL endpoints and register them here. These endpoints will be available in the RDM and allow you to import thesaurus entries from external sources.
Arches allows you to save a search and present it as a convenience for your users. Saved Searches appear as search options in the main Search page. Creating a Saved Search is a three-step process.
Specify Search Criteria - Go to the Search page and enter all the criteria you would like to use to configure your Saved Search. You may notice that with the addition of each new search filter (either by using the term filter, map filtering tools, or temporal filters) the URL for the page will change.
Copy the URL - In your browser address bar, copy the entire URL. This will be a long string that defines each of the search filters created in step 1.
Create the Saved Search - Finally, head back to this page and fill out the settings that you see at left. You can also upload an image that will be shown along with your Saved Search.
Arches creates a Time Wheel based on the resources in your database, to allow for quick temporal visualization and queries. A few aspects of this temporal search are defined here.
Color Ramp - Currently unused (saved for future implementation). The color ramp for the time wheel. For further reference, check out the d3 API reference.
Time wheel configuration - Currently unused (saved for future implementation). You can, however, modify the time wheel configuration using the advanced settings, Time Wheel Configuration.
Because these settings are stored in the database, as opposed to a settings.py file, if you drop and recreate your database, you will lose them and need to re-enter them by hand. To avoid this, you should run this command after you have finished configuring settings through the UI:
A file named “System_Settings.json” will be saved to the directory indicated. If no directory is indicated the file will be saved to settings.SYSTEM_SETTINGS_LOCAL_PATH, which is my_project/my_project/system_settings/ by default. This same path is used to import settings when a new package is loaded into your project.
The first item of business when preparing your production of Arches is to change the Admin user’s password. You cannot change the Admin user’s password in the Arches UI because the Admin account is not associated with an email. Instead you’ll need to use the Django admin page:
Log in as admin to Arches or in the Django admin (http://localhost:8000/admin/)
Navigate to the Django admin user page http://localhost:8000/admin/auth/user/.
In the upper right of the page select CHANGE PASSWORD and follow the steps to update the password.
In reality, many more settings are used than are exposed in the UI. To see all settings look in the core Arches settings.py file (we try to leave comments on each one). The way these settings are cascaded through the app, and where they can be overwritten as needed, is described below.
Settings here define backend information specific to the package loaded to your app. You do not need to create or modify this file as it will be loaded when you load a package. However, you may want to edit this file if your intent is to design or modify a package.
Typically kept out of version control, a settings_local.py file is used for 1) sensitive information like db credentials or keys and 2) environment-specific settings, like paths needed for production configuration.
↓ values here can be superseded by… ↓
System Settings Manager
Settings exposed to the UI are the end of the inheritance chain. In fact, these settings are stored as a resource in the database, and the contents of this resource are defined in the System Settings Graph. Nodes in this graph with a name that matches a previously defined setting (i.e. in the files above) will override that value with whatever has been entered through the UI.
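The inheritance chain can be illustrated with a small, self-contained Python sketch. It only mimics the lookup order described here and is not how Arches loads settings internally; the dictionaries and their keys and values are hypothetical stand-ins for each settings source.

```python
from collections import ChainMap

# Hypothetical values standing in for each settings source (illustration only).
core_settings = {"APP_NAME": "Arches", "DEFAULT_MAP_ZOOM": 0}  # core Arches settings.py
package_settings = {"APP_NAME": "My Package"}                   # loaded with a package
settings_local = {"DEFAULT_MAP_ZOOM": 4}                        # environment-specific file
system_settings_ui = {"APP_NAME": "Heritage Inventory"}         # entered through the UI

# ChainMap searches its maps from left to right, so the source at the end of
# the inheritance chain (the UI) is listed first and wins when names match.
effective = ChainMap(system_settings_ui, settings_local, package_settings, core_settings)

print(effective["APP_NAME"])          # value entered through the UI
print(effective["DEFAULT_MAP_ZOOM"])  # value from settings_local.py
```

A setting that is never redefined further down the chain keeps its value from the core file.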
If you’re a developer, you’ll notice that the codebase uses:
This is to ensure that UI settings are implemented properly. If you are using settings outside of a UI context you will need to follow the import statement with settings.update_from_db().
By default, Arches requires that passwords meet the following criteria:
Have at least one numeric and one alphabetic character
Contain at least one special character
Have a minimum length of 9 characters
Have at least one upper and one lower case character
Admins can change these requirements by configuring the AUTH_PASSWORD_VALIDATORS setting in their project’s settings_local.py file. Below is the default validator setting:
AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'arches.app.utils.password_validation.NumericPasswordValidator',  # Passwords cannot be entirely numeric
    },
    {
        'NAME': 'arches.app.utils.password_validation.SpecialCharacterValidator',  # Passwords must contain special characters
        'OPTIONS': {
            'special_characters': ('!', '@', '#', ')', '(', '*', '&', '^', '%', '$'),
        }
    },
    {
        'NAME': 'arches.app.utils.password_validation.HasNumericCharacterValidator',  # Passwords must contain 1 or more numbers
    },
    {
        'NAME': 'arches.app.utils.password_validation.HasUpperAndLowerCaseValidator',  # Passwords must contain upper and lower characters
    },
    {
        'NAME': 'arches.app.utils.password_validation.MinLengthValidator',  # Passwords must meet minimum length requirement
        'OPTIONS': {
            'min_length': 9,
        }
    },
]
To remove a password validator in Arches, you can simply remove a validator from the list of AUTH_PASSWORD_VALIDATORS.
To modify the list of required special characters, simply edit the list of characters in the special_characters option in the SpecialCharacterValidator validator.
To change the minimum length of a password, change the min_length property in the MinLengthValidator validator.
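As a sketch of these three kinds of change together, the following settings_local.py fragment removes the upper/lower case validator, shortens the list of special characters, and raises the minimum length. The specific values (five special characters, a minimum length of 12) are illustrative assumptions, not recommendations from Arches.

```python
# settings_local.py (sketch): a customized validator list
AUTH_PASSWORD_VALIDATORS = [
    {"NAME": "arches.app.utils.password_validation.NumericPasswordValidator"},
    {
        "NAME": "arches.app.utils.password_validation.SpecialCharacterValidator",
        # Shortened list of accepted special characters (illustrative choice).
        "OPTIONS": {"special_characters": ("!", "@", "#", "$", "%")},
    },
    {"NAME": "arches.app.utils.password_validation.HasNumericCharacterValidator"},
    # HasUpperAndLowerCaseValidator has been removed from the list entirely.
    {
        "NAME": "arches.app.utils.password_validation.MinLengthValidator",
        # Minimum length raised from the default of 9 (illustrative choice).
        "OPTIONS": {"min_length": 12},
    },
]
```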
By default Arches will bin your data in the search page time wheel based on your data’s temporal distribution. This enables Arches to bin your data efficiently. If your data spans over 1000 years, the bins will be by millennium, half-millennium and century. If your data spans less than a thousand years, your data will be binned by millennium, century, and decade.
You may decide, however, that the bins do not reflect your data very well, and in that case you can manually define your time wheel configuration by editing the TIMEWHEEL_DATE_TIERS setting.
Each tier (‘Millennium’, ‘Century’, and ‘Decade’ are each tiers) will be reflected as a ring in the time wheel.
Properties:
“name” - The name that will appear in the description of the selected period
“interval” - The number of years in each bin. For example, if your data spans 3000 years, and your interval is 1000, you will get three bins in that tier.
“root” - This applies only to the root of the config and should not be modified.
“child” - Adding a child will add an additional tier to your time wheel. You can nest as deeply as you like, but the higher the resolution of your time wheel, the longer it will take to generate the wheel.
“range” - A range is optional, but including one will restrict the bins to only those within the range.
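A minimal sketch of what such a configuration might look like, using the properties listed above. The tier names, intervals, and year ranges are illustrative, and the "min"/"max" keys inside "range" and the boolean value of "root" are assumptions; check them against your Arches version before relying on them. The small helper tier_names is hypothetical and only shows how the tiers nest.

```python
# settings.py (sketch): three nested tiers, each a ring in the time wheel
TIMEWHEEL_DATE_TIERS = {
    "name": "Millennium",
    "interval": 1000,
    "root": True,  # applies only to the root of the config
    "child": {
        "name": "Century",
        "interval": 100,
        "range": {"min": 1500, "max": 2000},  # assumed key names
        "child": {
            "name": "Decade",
            "interval": 10,
            "range": {"min": 1750, "max": 2000},
        },
    },
}


def tier_names(tier):
    """List tier names by following 'child' entries from the root downward."""
    names = []
    while tier:
        names.append(tier["name"])
        tier = tier.get("child")
    return names


print(tier_names(TIMEWHEEL_DATE_TIERS))
```

Each additional "child" adds one more tier, so deeper nesting means a higher-resolution wheel that takes longer to generate.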
If you do need to represent decades or years in your time wheel and this impacts performance, you can cache the time wheel for users that may load the search page frequently. To do so, you just need to activate caching for your project.
If you have Memcached running at the following location 127.0.0.1:11211 then the time wheel will automatically be cached for the ‘anonymous’ user. If not, you can update the CACHES setting of your project:
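A minimal sketch of such a setting, using Django's file-based cache backend. The cache location and the MAX_ENTRIES value are illustrative assumptions, and APP_ROOT is given a placeholder value here only so the snippet runs on its own; in a real project, use the project directory variable already defined in your settings.py.

```python
import os

# Placeholder so this sketch is self-contained; replace with your project's own root path.
APP_ROOT = os.getcwd()

CACHES = {
    "default": {
        # Django's built-in backend that stores cache entries as files on disk.
        "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
        # Directory inside the project where cache files are written (illustrative path).
        "LOCATION": os.path.join(APP_ROOT, "tmp", "djangocache"),
        "OPTIONS": {"MAX_ENTRIES": 1000},
    }
}
```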
This will cache the time wheel to your project’s directory. There are other ways to define your cache that you may want to use. You can read more about those options in Django’s cache documentation.
By default the time wheel will only be cached for the ‘anonymous’ user for 24 hours. To add other users or to change the cache duration, you will need to modify this setting:
`CACHE_BY_USER = {'anonymous': 3600 * 24}`
The CACHE_BY_USER keys are user names and their corresponding value is the duration (in seconds) of the cache for that user.
For example, if I wanted to cache the time wheel for the admin user for 5 minutes, I would change the CACHE_BY_USER setting to:
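A sketch of that setting follows. Keeping the default ‘anonymous’ entry alongside the new ‘admin’ entry is an assumption; drop it if you do not want anonymous caching.

```python
CACHE_BY_USER = {
    "anonymous": 3600 * 24,  # 24 hours, expressed in seconds
    "admin": 60 * 5,         # 5 minutes, expressed in seconds
}
```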
Setting up your captcha will help protect your production from spam and other unwanted bots. To set up your production with captcha, first register your captcha and then add the captcha keys to your project’s settings.py. Do this by adding the following:
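A sketch of the two key settings, assuming your project uses Google reCAPTCHA through the django-recaptcha package. The setting names follow that package's convention and should be checked against your installed version; both values are placeholders to be replaced with the keys issued when you register.

```python
# settings.py (sketch): placeholder reCAPTCHA keys, replace with your own
RECAPTCHA_PUBLIC_KEY = "your-site-key"
RECAPTCHA_PRIVATE_KEY = "your-secret-key"
```

Because the private key is sensitive, it may be better placed in settings_local.py, which is typically kept out of version control.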
To let users sign up through the Arches UI, your project’s settings.py also needs Django email settings. Update the EMAIL_HOST_USER and EMAIL_HOST_PASSWORD with the correct email credentials and save the file. It is possible that this may not be enough to support your production of Arches. In that case, there’s more information on setting up an email backend on the Django site.
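For reference, the standard Django email settings involved look roughly like the sketch below. The setting names are Django's own; the host, port, and credentials are placeholder assumptions for an SMTP server that uses TLS, and must be replaced with values from your email provider.

```python
# settings.py (sketch): SMTP settings with placeholder values
EMAIL_USE_TLS = True
EMAIL_HOST = "smtp.example.com"       # placeholder SMTP server
EMAIL_HOST_USER = "user@example.com"  # placeholder account
EMAIL_HOST_PASSWORD = "change-me"     # placeholder; keep real credentials in settings_local.py
EMAIL_PORT = 587                      # common port for SMTP with STARTTLS
```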
To configure what group new users are put into, add the following lines of code to your project’s settings.py:
# group to assign users who self sign up via the web ui
USER_SIGNUP_GROUP = 'Crowdsource Editor'
If you would like to change which group new users are added to, replace ‘Crowdsource Editor’ with the group you would like to use.
Using Single Sign-On With an External OAuth Provider
To take advantage of single sign-on using an organization’s identity provider, users can be routed through an external OAuth provider for authentication based on their email’s domain.
Your Arches application will need to use SSL and be configured with an application ID from your provider. This application ID will need to be configured with a redirect URL to your Arches application at auth/eoauth_cb, for example: https://qa.archesproject.org/auth/eoauth_cb
Once your application is set up with the provider, you can configure Arches to use it by updating EXTERNAL_OAUTH_CONFIGURATION. For example, a configuration for an Azure AD tenant could look something like this:
EXTERNAL_OAUTH_CONFIGURATION = {
    # these groups will be assigned to OAuth authenticated users on their first login
    "default_user_groups": ["Resource Editor"],
    # users who enter an email address with one of these domains will be authenticated through external OAuth
    "user_domains": ["archesproject.org"],
    # claim to be used to assign arches username from
    "uid_claim": "preferred_username",
    # application ID and secret assigned to your arches application
    "app_id": "my_app_id",
    "app_secret": "my_app_secret",
    # provider scopes must at least give Arches access to openid, email and profile
    "scopes": ["User.Read", "email", "profile", "openid", "offline_access"],
    # authorization, token and jwks URIs must be configured for your provider
    "authorization_endpoint": "https://login.microsoftonline.com/my_tenant_id/oauth2/v2.0/authorize",
    "token_endpoint": "https://login.microsoftonline.com/my_tenant_id/oauth2/v2.0/token",
    "jwks_uri": "https://login.microsoftonline.com/my_tenant_id/discovery/v2.0/keys",
    # enforces token validation on authentication, AVOID setting this to False
    "validate_id_token": True,
}
As of version 7.5, Arches can be configured to meet WCAG defined AA level accessibility requirements for all public facing user interfaces (all content available to anonymous users without a login, including the home page, search interface, and resource reports). Improved accessibility helps to promote a more welcoming and inclusive community, and may help to meet important legal and ethical requirements, especially for institutions that serve the public.
To enable the “Accessibility Mode”, update your Arches project settings.py or settings_local.py file and add:
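A one-line sketch of that change. The setting name ACCESSIBILITY_MODE is given here as an assumption; confirm it against the settings reference for your Arches version.

```python
# settings.py or settings_local.py (sketch)
ACCESSIBILITY_MODE = True
```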
Once you’ve saved that change, restart Arches. Arches should now display more accessible user interfaces for public facing content. The specific accessibility enhancements activated in Accessibility Mode include:
Markup to support labeling for screen readers
Tabbing to support natural flow through the site
Improved focus management especially when interacting with popup/slide out panels
Updated drop downs to use an accessible version
Updated text contrast
Updated HTML to reflow properly on smaller screen sizes or when the screen is zoomed up to 400%
Let’s begin with a brief primer on some of the core concepts upon which Arches is constructed.
Resources - Resources are what we call database records. If you are using Arches to create an inventory of historic buildings, each one of those buildings will be recorded as a “resource”. This terminology is used throughout the app.
Resource Models - When creating new Resources, a data entry user must decide which Resource Model to use, determining what information is collected for the Resource. Think of different Resource Models as categories of records in your database – “Buildings” vs. “Archaeological Sites” vs. “Cemeteries”, for example. Every Arches database must have at least one Resource Model.
Branches - Branches are tools for transport of complex node structures from one Resource Model to another. This allows you to avoid manually recreating the same “branches” in multiple Resource Models.
Note
Both Resource Models and Branches are sometimes referred to generically as “graphs”. This is because their underlying architecture is a graph. However, as you’ll see, they play completely different roles in Arches.
Important
The Arches Designer is used for altering the record-keeping structure of your database; it does not alter the physical Data Model.
Warning
If you need to have multiple versions of the same graph, perhaps because multiple people are designing it or because you need to retain earlier iterations while continuing to add nodes, you must Clone the graph. If a graph is renamed, exported, and imported, it will still overwrite the original, because the unique ID will remain unchanged.
The Arches Designer is where you export, import, duplicate, modify, and create your Resource Models and Branches. Any user who is part of the Graph Editor group will have access to the Arches Designer.
If you don’t see any Resource Models listed in your Arches Designer, you may want to consider loading a package. Alternatively, you can directly import individual Resource Model files through Add ….
To edit a Resource Model, click on it or click Manage … > Manage Graph and you’ll be brought to the Graph Designer.
Almost all aspects of Resource Model and Branch design are handled in the Graph Designer. The exception is Functions, which are handled in the separate Function Manager.
The Graph Designer comprises three tabs, the Graph Tab, Cards Tab, and Permissions Tab. Each tab is used to configure a different aspect of the Resource Model: In the Graph Tab you design the node structure, in the Cards Tab you configure the user interface (card) for each nodegroup, and in the Permissions Tab you are able to assign detailed permission levels to each card. The general workflow for using the Graph Designer is to proceed through the tabs in that same order.
The Graph Tab is where you build the actual graph, a structured set of nodes and nodegroups, which is the core of a Resource Model or Branch. As noted above, sometimes Resource Models and Branches are generically referred to as “graphs”, and this may seem confusing at first, but you’ll come to see that it is an appropriate nickname.
Screenshot of the Graph Tab in the Graph Designer, showing an “Actor” Resource Model.
In practice, constructing the graph means adding nodes (or existing Branches) to the Graph Tree, which appears on the left side of the page when the Graph Tab is activated. When you add a new node, you set many different settings for that node, like datatype, in the main panel of the page.
During the graph construction process, you are able to create a new Branch from any portion of your graph. This is useful if you have completed a large section of the graph, and want to reuse it later in another Resource Model.
Note
If you are building a graph that uses an ontology, the ontology rules will automatically be enforced during this graph construction process.
Along the way, you can use the preview button to display the graph in a more graph-like manner. This view will be familiar to users of Arches going back to version 3.0.
Screenshot of the Graph Tab in the Graph Designer, showing the graph in preview mode.
Nodes in Arches must be configured with a “Data Type”, and different datatypes store different kinds of information. For example, a string datatype is what you should use to store arbitrary text, like the name or description of a resource. A brief description of all datatype options in core Arches follows. Developers can extend Arches by creating their own custom datatypes.
semantic:
A semantic node does not store data. Semantic nodes are used where necessary to make symbolic connections between other nodes, generally in order to follow ontological rules. The top node of every graph is a semantic node.
string:
Stores a string of text. This could be something simple like a name, or something more elaborate like a descriptive paragraph with formatting and hyperlinks.
number:
Stores a number.
file-list:
Stores one or more files. Use this to upload images, documents, etc.
concept:
Stores one of a series of concepts from the Reference Data Manager. Users will choose a concept in a dropdown list or set of radio buttons. You’ll further be prompted to choose a Concept Collection—this controls which concepts the user is able to choose from.
concept-list:
Stores multiple concepts in a single node.
geojson-feature-collection:
Stores location information. Use this for a node that should be displayed as an overlay on the main search map.
domain-value:
Similar to “concept”, choose this to present the user with a dropdown list or set of radio buttons. Unlike “concept”, this dropdown menu will not come from your system-wide controlled vocabulary, but from a list of values that you must define here.
domain-value-list:
Stores multiple domain-values in a single node.
date:
Stores a CE calendar date. See edtf for BCE and fuzzy date handling.
node-value:
Stores a reference to a different node in this graph. This would allow you to store duplicate data in more than one branch.
boolean:
Use this to store a “yes”/”no” or “true”/”false” value.
edtf:
Stores an Extended Date/Time Format value. Use this data type for BCE dates or dates with uncertainty. This datatype requires extra configuration to inform the database search methods how to interpret EDTF values. Data entry users can enter EDTF dates using formats listed in the EDTF draft specification.
annotation:
Used to store an IIIF annotation.
url:
Stores a web address.
resource-instance:
Embeds a separate resource instance into this node. For example, you could add a node called “Assessed By” to a condition assessment branch, and use this data type. This would allow you to associate an individual stored in your database as an Actor resource with a specific condition assessment. Note that this construction is different from making a “resource-to-resource relationship”.
resource-instance-list:
Stores a list of resource instances in a single node.
Once you have added nodes to the graph, you can switch to the Cards Tab to begin refining the user interface. As you can see, the graph tree is replaced with a “card tree”, which is very similar to what users will see when they begin creating a resource using this Resource Model.
Screenshot of the Cards Tab in the Graph Designer, showing an “Actor” Resource Model.
The top of the card tree is the root of the Resource Model, and you’ll select it to configure the public-facing resource report. Below this, you’ll see a list of cards in the Resource Model, some of which may be nested within others. There will be a card in the card tree for every nodegroup in the graph tree. Finally, within each card you’ll see one or more widgets. These correspond to nodes in the graph that collect business data. In the image above, the Appellation widget is selected.
When you select a card or a widget, you will see the Card Manager or Widget Manager appear on the right-hand side of the page. This is where you will update settings like labels, placeholder text, tooltips, etc. The middle of the page shows a preview of how a data entry user will experience the card.
Tip
While working with the Cards Tab, you may need to go back and change a node in the Graph Tab. Be aware that though you may expect node changes in the Graph Tab to cascade to widget configurations in the Cards Tab, this does not always happen. Be sure to double-check your work!
The UI of a card can be configured using a card component. Note that when you click a node in the card tree, the “Card Configuration” panel on the right-hand side of the screen will show the card component in a dropdown called “Card Type”.
Screenshot of the card manager user interface, highlighting “Card Type” dropdown in the top-right corner.
The “CSS Classes” input box enables a user to enter space-separated class names (e.g. card-empty-class card-incomplete-class) that correspond to class names defined by a developer in package.css.
While card components can be created from scratch, Arches (v5 on) comes with a few out of the box:
The Grouping Card groups multiple cards into a single user interface (UI). One card acts as the root of the group by changing its “Card Type” to “Grouping Card” and then assigning “sibling” cards to it (in the last field of the Card Configuration section). While Arches makes it easy to edit an existing card to include other nodes, the Grouping Card is useful when resource instances already exist for a model, which prevents you from editing the cards, but you still want to group different cards together.
The Map Card enables more customization for nodes of type geojson-feature-collection. It has optional settings to start the map at a specific LatLng center and default zoom level. It can also import a particular map source layer of data into the UI. This might be useful if the user entering new geometry would benefit from having other resource data for reference in the map. To add a map source or source_layer, simply type its name (no quotes).
Screenshot of card configuration panel, highlighting the fields: “Select drawings map source” and “Select drawings map source layer”.
The Related Resources Map Card enables a richer user experience for nodes of type related-resource. Like the Map Card, map layer data representing resources can be added to a map UI such that the user can navigate geographically to select a related resource instead of paging through the dropdown list of relatable resources (however, the dropdown still works normally in this card component). This card component is very useful if a user knows the geographic context of a resource (like what neighborhood it’s in) rather than its name. The steps to add such map data are the same as in the Map Card configuration panel.
Screenshot of a card using related resources map card, showing a selected resource in the map, polygon outlined in purple to show selection, and the resource instance’s name selected in the dropdown widget to the right of the map.
Arches allows you to define permissions at the card level, so in the Permissions Tab you’ll see the card tree, just as in the Cards tab. However, you will only be able to select entire cards, not individual nodes.
Screenshot of the Permissions Tab in the Graph Designer, showing an “Actor” Resource Model.
Once you have selected one or more cards, you can select a user or user group and then assign one of the following permissions levels:
Delete:
Allows users to delete instances of this nodegroup. Note, this is not the same as being allowed to delete an entire resource, permissions for which are not handled here.
No Access:
Disallows users from seeing or editing instances of this nodegroup. Use this permission level to hide sensitive data from non-authenticated users (the public).
Read:
Allows users to see this nodegroup’s card. If disallowed, the card/nodegroup will be hidden from the map and resource reports.
Create/Update:
Allows users to create or edit instances of this nodegroup. This provides the ability to let users edit some information about a resource, while being restricted from editing other information.
Arches data is modeled with graphs. A graph is a collection of nodes, structured like branches, all emanating from the
root node, which represents the resource itself. If you are modeling a building resource, you may have a root node
called “Building” with a node attached to it called “Name”. You can imagine that complex and thoroughly documented
resources will have many, many nodes.
An ontology is a set of rules that categorizes these nodes into classes, and dictates which classes can be connected to
each other. It’s a “rulebook” for graph construction.
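The “rulebook” idea can be sketched in a few lines of Python. The class and property names below are made up for illustration and do not come from any real ontology; the RULES table and the can_connect function are likewise hypothetical, not part of Arches.

```python
# Hypothetical rulebook: (parent class, property) -> classes allowed as the child node.
RULES = {
    ("Building", "is_identified_by"): {"Name"},
    ("Building", "has_location"): {"Place"},
}


def can_connect(parent_class: str, prop: str, child_class: str) -> bool:
    """Return True if the rulebook allows child_class under parent_class via prop."""
    return child_class in RULES.get((parent_class, prop), set())


print(can_connect("Building", "is_identified_by", "Name"))   # allowed by the rulebook
print(can_connect("Building", "is_identified_by", "Place"))  # not allowed by the rulebook
```

A graph built under an ontology only accepts connections that pass a check of this kind, which is what keeps graphs from different authors structurally compatible.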
For many Arches applications data modelers will want to use a CRM (Conceptual Reference Model).
The CIDOC CRM v6.2 is an ontology created by ICOM specifically to describe cultural heritage data. To learn more about
the CIDOC CRM, visit cidoc-crm.org or view a full list of classes and
properties.
Arches no longer comes preloaded with the CIDOC CRM, but it’s simple to load it or any other ontology. To load the CRM,
just download or clone it from this repository: archesproject/cidoc-crm-ontology.
If you are developing an Arches package, you can simply unzip the downloaded zip file, and add the cidoc_crm folder to
your package’s ontologies directory. When you load your package, the CIDOC CRM will load with it:
/my_package/
└─ ontologies
    └─ cidoc_crm
If you are not loading a package, you can unzip the downloaded file, and then run the following command with your
virtual environment activated:
If you have created your own ontology or have a different version of the CIDOC CRM, then just add
your files to a folder and include an ontology_config.json file which contains the metadata for your ontology. Here’s
an example:
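The sketch below builds such a metadata file's contents in Python and prints them as JSON. The key names (base, base_name, extensions, base_version, base_id) are an assumption based on the layout used for the CIDOC CRM in archesproject/cidoc-crm-ontology, so compare them with the ontology_config.json in that repository. All file names and values are hypothetical placeholders.

```python
import json
import uuid

# Hypothetical metadata for a custom ontology; every value here is a placeholder.
ontology_config = {
    "base": "my_ontology.xml",            # main ontology file in the folder
    "base_name": "My Ontology v1.0",      # display name
    "extensions": ["my_extension.xml"],   # optional extension files
    "base_version": "1.0",
    "base_id": str(uuid.uuid4()),         # unique identifier for this ontology
}

# Text that would be saved as ontology_config.json next to the ontology files.
config_text = json.dumps(ontology_config, indent=4)
print(config_text)
```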
When creating Resource Models and Branches, users have the option of enforcing an ontology throughout the graph, or
creating a graph with no ontology. If an ontology is chosen, the Graph Designer will enforce all of the applicable node
class (CRM Entities) and edge (CRM Properties) rules during use of the Graph Designer. Importantly, if a Resource Model
uses an ontology one can only add Branches to it that have been made with the same ontology.
Arches allows a great deal of customization for the layers on the search map. The contents of the following section will be useful when using the Map Layer Manager to customize your layers.
Resource Layers display the resources in your database. One Resource Layer is created for each node with a geospatial datatype (for example, geojson-feature-collection). You are able to customize the appearance and visibility of each Resource Layer in the following ways.
Styling
Define the way features will look on the map. The example map has demonstration features that give you a preview of the changes you make. You can choose to use Advanced Editing to create a more nuanced style. Note that changes made in Advanced Editing will not be reflected if you switch back to basic editing. For styling reference, check out the Mapbox Style Specification.
Clustering
Arches uses “clustering” to better display resources at low zoom levels (zoomed out). You are able to control the clustering settings for each resource layer individually.
Cluster Distance - distance (in pixels) within which resources will be clustered
Cluster Max Zoom - zoom level after which clustering will stop being used
Cluster Min Points - minimum number of points needed to create a cluster
Caching
Caching tiles will improve the speed of map rendering by storing tiles locally as they are created. This eliminates the need for new tile generation when viewing a portion of the map that has already been viewed. However, caching is not a simple matter, and it is disabled by default. Caching is only advisable if you know what you are doing.
A Basemap will always be present in your map. Arches comes with a few default basemaps, but advanced users can configure and add more.
Overlays are the best way to incorporate map layers from external sources. On the search map, a user is able to activate as many overlays as desired simultaneously. Users can also change the transparency of overlays. New overlays can be added in the same manner as new basemaps.
Adding New Basemaps or Overlays
If you are a developer interested in creating new map layers (which could be new visualizations of resources or new basemaps and overlays), please see Creating New Map Layers.
Layer name - Enter a name to identify this basemap.
Default search map - For basemaps, you can designate one to be the default. For overlays, you can choose whether a layer appears in the search map by default. Note that in the search map itself you can change the order of overlays.
As of Arches version 7.4.0, you can assign different permissions to specific Arches users and groups. To manage such permissions, please review Map Layer Permissions.
The Arches Reference Data Management (RDM) tool is a core Arches
module which enables the creation and maintenance of controlled
vocabularies for use in dropdowns and controlled fields within the
various Arches Resource forms.
The use of the RDM is restricted to the Reference Data Manager, the
person responsible for maintaining the controlled vocabularies. It
allows for the creation, update, amendment and deletion of concept
schemes (controlled vocabularies). In addition the RDM enables you to
export your schemes as SKOS-Compliant XML files as well as the import
of external thesauri. For more information on SKOS see
http://www.w3.org/2004/02/skos/.
A concept scheme can be viewed as an aggregation of one or more
concepts and the semantic relationships (links) between those
concepts.
Each controlled vocabulary within the Arches RDM, whether it is a
simple wordlist or a polyhierarchical thesaurus, is defined as a
concept scheme.
It is possible to add multiple notes to a scheme. This allows the
reference data manager to add more information regarding the scheme
including the scope of what it covers, its definition, changes to the
history of the scheme, and how it should be used.
Select Add Note. The Add Concept Note pop-up will appear.
Enter the text for the new note in the ‘Note Editor’ field.
Click in the field marked ‘scopeNote’. The list of Note types
will appear.
Select the relevant Note type.
Note
Only one note of each type is allowed.
Select the language of the Note by clicking in the ‘Language’
field. This currently defaults to en-(US) (English)
Click the Save button. The new Note will appear in the
Notes panel.
Having created the new scheme you should now add the Top
Concepts. These will form the framework for the vocabulary and act as
the parents for more detailed concepts. This multi-level construction
is known as the hierarchy. In a simple wordlist there will be only one
level of concepts but in a complex thesaurus the hierarchy can be many
levels deep.
In the Right hand panel select Add Top Concept from the
Manage dropdown. The Add Concept pop-up will appear.
Enter the text for the label in the ‘Label’ field.
Enter the definition of the concept in the ‘Note’ field. The
list of Note types will appear.
Select the language of the Note by clicking in the ‘Language’
field. This currently defaults to en-(US) (English)
Select hasTopConcept from the ‘Relation from Parent’ field.
Click the Save Changes button. The new concept will appear in
the Broader/Narrower Concepts panel.
Other concept schemes similar to the one you are developing may
already exist. If this is the case, it is possible to import concepts
along with their attributes from an external source. By default the
RDM can import concepts from the Getty Art and Architecture
Thesaurus (AAT).
Select the concept, which will act as the parent for the new child
concept, by clicking on it. The Concept details panel will appear.
In the Right hand panel select Import Child from SPARQL from
the Manage dropdown. The Import Concept pop-up will appear.
Select Getty AAT from the list of Schemes available.
In the ‘Search for a concept’ field type the text of a concept,
eg. castle. A selection of concepts matching the text will appear.
Select the appropriate concept. The Concept Identifier field will
be populated with the URI of the concept.
Click the Import button. The new concept will appear in the
Broader/Narrower Concepts panel.
Click on the concept. The Concept details panel will appear and the
Notes panel will be populated with the external concept’s
scopeNote
Adding an additional Parent Concept (polyhierarchy)#
Some concepts may have more than one parent. For example, a castle is
a type of fortification but it is also a domestic building. This
situation, where a concept has more than one possible parent concept,
is called polyhierarchy.
Select the concept, which you want to add a parent concept to, by
clicking on it.
Select Manage Parents from the Manage dropdown. The New
Parent Concept pop-up will appear.
In the ‘Search for a concept’ field type the text of the parent
concept you are going to add, eg. domestic buildings. A selection
of concepts matching the text will appear.
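The idea of polyhierarchy can be sketched with a small in-memory model. This is only an illustration of the concept; the dictionary and the helper names `add_parent` and `narrower` are hypothetical and do not reflect how Arches stores concepts.

```python
from collections import defaultdict

# Hypothetical in-memory model: concept label -> set of broader (parent) labels.
# This is NOT the Arches database schema, only an illustration.
broader = defaultdict(set)


def add_parent(concept, parent):
    """Record that `parent` is a broader concept of `concept`."""
    broader[concept].add(parent)


def narrower(parent):
    """Return every concept that lists `parent` as a broader concept."""
    return {c for c, parents in broader.items() if parent in parents}


add_parent("castle", "fortifications")
add_parent("castle", "domestic buildings")  # second parent: polyhierarchy
add_parent("manor house", "domestic buildings")

print(sorted(broader["castle"]))
print(sorted(narrower("domestic buildings")))
```

Because each concept maps to a set of parents rather than a single parent, the same concept can appear under several branches of the hierarchy.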
For any Concept the Broader/Narrower Concepts panel defaults to the
tree view and shows a concept’s immediate broader (parent) and
narrower (child) concepts. The scheme may also be browsed using the
graph interface.
In the Broader/Narrower Concepts panel click Show graph. The
graph view will appear centred on the concept you have chosen.
Navigate the graph by clicking on the ‘nodes’ (the
circles). Clicking on a node will bring up a dialog box with the
concept label and an ‘x’ symbol.
Click on the label to jump to the details for a concept.
As part of a thesaurus it is possible to relate concepts which are not
hierarchically related but may be of interest to a user. This
‘Associative’ relationship can be made by relating one concept to many
others.
In the Right hand panel click on Add Related Concept in the
Related Concepts panel. The Manage Related Concepts pop-up will
appear.
Enter the text for the related concept in the ‘Select a
concept’ field. A selection of concepts matching the text will
appear.
Select the appropriate concept.
Click in the Relation type field. The Relation Type dropdown
will appear.
Select ‘Related’.
Click the Save button. The related term is added to the
concept.
Deleting a concept is simple in Arches but care should be taken that
the concept has not been used in any recording forms. If a concept has
been used a warning message will appear informing the Reference Data
Manager that all instances of the concept in use must be replaced with
an alternative concept before the concept can be deleted. If the
Reference Data Manager is certain the concept has not been used then
the concept may be deleted using either of the following methods.
Identify the concept’s parent concept and bring up its details.
In the Broader/Narrower Concepts panel make sure the tree view is
visible (This is the default view).
Click on the ‘x’ symbol next to the concept to be deleted. The
Delete a Concept pop-up will appear.
Note
A warning message ‘By deleting this concept, you will also be
deleting the following concepts as well. This operation cannot
be undone.’ will appear. If you do not want to delete the
concept click the No button.
Click the Yes button. The concept is deleted (along with any of
its children).
Identify the concept’s parent concept and bring up its details.
In the Broader/Narrower Concepts panel make sure the graph view
is visible by clicking on Show graph.
Click on the node for the concept to be deleted. A dialog box with
the concept label and a ‘x’ symbol will appear.
Click on the ‘x’ symbol next to the concept to be deleted. The
Delete a Concept pop-up will appear.
Note
A warning message ‘By deleting this concept, you will also be
deleting the following concepts as well. This operation cannot
be undone.’ will appear. If you do not want to delete the
concept click the No button.
Click the Yes button. The concept is deleted (along with any of
its children) and the node will disappear.
In the left hand panel select Import Scheme from the Tools
dropdown. The Import New Concept Scheme pop-up will appear.
Click the Choose File button. Your operating system’s file
browser will appear.
Navigate to the file to be uploaded.
Note
This file should be a SKOS file in any format parseable by
Python’s RDFLib.
Examples include RDF/XML, N3, NTriples, N-Quads, Turtle, TriX,
RDFa and Microdata.
Click Open. You will be returned to the Import New Concept
Scheme pop-up and the name of the file will have populated the form.
Click Upload File. The number of Concept Schemes will have
increased by 1 and the imported concept scheme will appear in the
ConceptSchemes panel.
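To make the expected file shape more concrete, the following is a minimal sketch that writes a small SKOS concept scheme as RDF/XML using only the Python standard library. The base URI, the scheme title, and the helper name `build_scheme` are assumptions for illustration. The use of `dcterms:title` for the scheme name is a common SKOS convention rather than something this documentation specifies, and Arches may expect additional properties.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
BASE = "http://example.org/vocab/"  # hypothetical base URI, replace with your own

for prefix, uri in (("rdf", RDF), ("skos", SKOS), ("dcterms", DCT)):
    ET.register_namespace(prefix, uri)


def build_scheme(title, top_labels, lang="en"):
    """Return RDF/XML text for a concept scheme with the given top concepts."""
    root = ET.Element(f"{{{RDF}}}RDF")
    scheme_uri = BASE + "scheme"
    scheme = ET.SubElement(
        root, f"{{{SKOS}}}ConceptScheme", {f"{{{RDF}}}about": scheme_uri}
    )
    ET.SubElement(scheme, f"{{{DCT}}}title", {XML_LANG: lang}).text = title
    for label in top_labels:
        uri = BASE + label.lower().replace(" ", "-")
        # Link the scheme to its top concept, then describe the concept itself.
        ET.SubElement(scheme, f"{{{SKOS}}}hasTopConcept", {f"{{{RDF}}}resource": uri})
        concept = ET.SubElement(root, f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": uri})
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {XML_LANG: lang}).text = label
        ET.SubElement(concept, f"{{{SKOS}}}inScheme", {f"{{{RDF}}}resource": scheme_uri})
    return ET.tostring(root, encoding="unicode")


skos_xml = build_scheme("Example Monument Types", ["Fortifications", "Domestic buildings"])
print(skos_xml)
```

It is a good idea to try a small file like this on a test instance before importing a large thesaurus.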
In the left hand panel select Delete Scheme from the Tools
dropdown. The Delete Scheme pop-up will appear.
Note
A warning message stating ‘You won’t be able to undo this
operation! Are you sure you want to permanently delete this
entire scheme from Arches?’ will appear. If you do not want to
delete the scheme click the Close button.
Click in the Select Scheme to Delete field. The Concept
Scheme dropdown menu will appear.
Select the scheme to be deleted and click on it. The scheme name
will populate the field.
Click the Delete button.
Note
The warning message will appear again along with a list of all
of the concepts to be deleted. If you do not want to delete the
scheme click the Close button.
Click the Delete button to confirm deletion. The scheme is
deleted along with all its concepts.
Graph Design, Instance Relationships, and Concept Labels#
Arches v7.4.0 introduced features that enable administrators to define a wider range of relationship types between resource instances. Prior to v7.4.0, relationship types could only be defined in the resource instance widget using ontology properties. Arches v7.4.0 enables users to define relationships using concept values from concept collections managed in the Reference Data Manager (RDM).
Steps to Make Custom Relationships between Resource Instances#
You will need administrative privileges to use the RDM and edit resource models and branches. If you have such permissions, the following steps enable customization of relationships between resource instances:
Create and define custom relationship concepts
In the RDM Thesauri tab, navigate to and then select the “Arches” > “Resource To Resource Relationship Types” concept. Under the blue “Manage” option button (right side of the screen), select “Add Child”.
Fill out the “Add Concept” form to describe your new custom resource to resource relationship type. If the direction of your custom relationship matters, you should also define an inverse relationship. For example, the inverse of “contains” can be “is contained by”.
Add custom relationship concept to dropdown entry
In the RDM Collections tab, navigate to and then select the “Resource To Resource Relationship Types” collection (it has the same name as the concept). Click the “Add dropdown entry” text, and this will open a dialogue where you can find and select your custom relationship concept to add to the “Resource To Resource Relationship Types” collection. This step makes your custom relationship available for use when you edit or create resource instances.
Use your custom resource to resource relationship concept in a branch
After you finish creating custom relationships (and their inverse relations), you can now use the Arches Designer to implement the custom relationships in your resource models. To use your custom resource to resource relationships, create a branch where the “Root Node Data Type” is either a “resource-instance” or a “resource-instance-list”. Once you select a resource model for use with this branch, click on the resource model label. This will open a form that will allow you to select a custom resource to resource relationship and the inverse of that relationship. After you save and publish, you will be able to use the custom resource to resource relationships as you create and edit resource instances.
Note
You can actually use any concept and collection you like (in other words, you are not restricted to the “Resource To Resource Relationship Types” concept). Our example use of the “Resource To Resource Relationship Types” concept and collection is a matter of convenience, and it probably makes sense for most users to use it unless their implementation really requires a more complex approach to defining relationships.
Arches is built with the Django framework (see the Django Documentation), and Arches makes use of the administrative user interface utilities that come standard with Django applications. The Django admin tools are intended for use by an organization’s trusted internal management team. They are not intended for wider use by end users.
Arches administrators can use the Django admin interface to control permissions for individual users and groups of users (see Managing Permissions) as well as certain site configurations and customizations (especially customizations for map interfaces).
Note
You can access the Django admin at localhost:8000/admin. The default admin credentials are admin/admin,
which must be changed in production. In production, the URL to the Django admin interface will take the form https://my-arches-site.org/admin/. Any user with “staff” status can access the Django admin panel.
Once logged into the admin panel, you’ll see a page similar to this:
Arches site administration in Django admin panel.#
As of version 7.4, Arches provides Bulk Data Manager user interface tools for administrators to import and update large sets of data “in bulk”. These allow administrators to make changes across large sets of data, not just record by record.
The Bulk Data Manager is an Arches plugin (see Plugins). This plugin will be installed when you install Arches, but, by default, the Bulk Data Manager will be hidden.
To enable use of the Bulk Data Manager, log in to the Django Admin User Interface, click the link to “Plugins” under models, click “Bulk Data Manager”, and edit the JSON value for the attribute “Config”.
To enable use of the Bulk Data Manager the Config should be: {"show":true}. To disable use of the Bulk Data Manager, the Config should be: {"show":false}. Once you’ve made your change, press the “Save” button in the lower right.
The image below illustrates how to enable the Bulk Data Manager:
Enable the Bulk Data Manager via the Django Admin panel.#
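If you prefer to generate or check the Config value programmatically before pasting it into the admin form, a minimal sketch with the Python standard library follows. The helper names `plugin_config` and `is_enabled` are hypothetical, and treating a missing `show` key as disabled is an assumption of this sketch, not documented Arches behavior.

```python
import json


def plugin_config(show):
    """Serialize the Config value in the compact form used in this section."""
    return json.dumps({"show": show}, separators=(",", ":"))


def is_enabled(config_text):
    """Parse a Config value and report whether it asks for the plugin to be shown.

    Assumption: a missing "show" key is treated as disabled.
    """
    config = json.loads(config_text)
    return config.get("show") is True


print(plugin_config(True))
print(is_enabled('{"show":false}'))
```

Parsing the text with `json.loads` before saving also catches simple mistakes such as writing `True` instead of the JSON literal `true`.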
Note
The Bulk Data Manager requires that you have properly installed and configured Task Management with Celery.
The Bulk Data Manager has several Import related features to support the configuration and ingest of tabular organized data into Arches. These features presume familiarity with both the core Arches Data Model and the specific resource models and branches (see Designing the Database) used in your instance.
The Bulk Data Manager import tools support imports of data stored in CSV and Excel files. The CSV and Excel importers require that data in tables (and in the case of Excel, worksheets) be organized so that they map properly to your resource models and to the node structures of those resource models. To assist in creating data properly structured for successful import, you can download an Excel workbook template for a given resource model. The animation below illustrates how to export a template for an example resource model.
Bulk Data Manager export of an Excel template for the (example) “Collection or Set” resource model#
To describe how to use the Bulk Data Manager to import data, we’ll refer to the Arches for Science project Collection or Set resource model as an illustrative example. In the Arches Designer, the card for the Name of Collection branch of the Collection or Set resource model looks like this:
Arches Designer view of the Name of Collection card used in the Collection or Set resource model#
If you used the Bulk Data Manager to download an Excel template file for this Collection or Set resource model, you would see worksheets for each branch used with the resource model. The Name of Collection branch of the Collection or Set resource model has shaded nodegroups and nodes that look like this:
Excel template worksheet for Collection or Set resource model Name of Collection branch nodegroups and nodes.#
The Excel template file also includes a worksheet called “metadata”. The metadata worksheet describes the datatypes (see more: Core Arches Datatypes) expected by each node:
Excel template metadata worksheet for datatypes used by Collection or Set branch nodes.#
Note
Bulk Uploading Files
If you want to import resource instances that include datatype “file-list” nodes, then the files associated with those nodes will need to be imported along with the Excel workbook. To do this, zip compress a folder that includes the Excel workbook to be imported along with the associated files (such as image files) named in the “file-list” nodes.
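The zip step described above can be scripted. The sketch below uses Python's standard `zipfile` module and keeps the folder itself as the top-level entry in the archive. The function name `zip_import_folder` and the example file names are hypothetical, and whether Arches requires the folder (rather than the bare files) at the top level of the archive is an assumption based on the wording above.

```python
import zipfile
from pathlib import Path


def zip_import_folder(folder, archive):
    """Zip every file under `folder`, keeping paths relative to its parent.

    Write `archive` outside `folder` so the archive does not include itself.
    Returns the list of names stored in the archive.
    """
    folder = Path(folder)
    names = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(folder.parent).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

For example, calling `zip_import_folder("collection_import", "collection_import.zip")` on a folder that holds `collection.xlsx` and the image files it names would produce one archive containing the workbook and those files.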
The Edit tab of the Bulk Data Manager enables Arches administrators to make mass edits of string data across many resource instances.
As of version 7.5.0, the current string editing options include:
Bulk Deletion
Change case (uppercase, lowercase, capitalize)
Replace Text
Remove Whitespace
Editing operations require all or some of the following options:
Search URL (optional) - Defines the bounds of what resources can be edited. The resources actually edited may be fewer than those the search returns (see below).
Resource Model - Resource instances of the model to edit
Node - The node value in each resource instance to edit
Nodegroup - (Deletion only) the tile associated with the nodegroup to delete
Language - The language to update in each node
From and To - (Replace Text only) the text you would like to search and replace
Search URL details
Copy and paste a URL of a search that retrieves a set of resource instances that you want to limit your bulk edit operation to.
This does not mean that those resources will actually be edited, only that resources that don’t fall within that search result won’t be edited.
For example, in a capitalize operation:
If a search URL returns 3 records but one of them is already capitalized then only the remaining 2 uncapitalized records will be updated.
If a search URL returns 3 records but the node in the model contains more than 3 records that are uncapitalized, then only the 3 records defined in your search will be updated.
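The way a search URL bounds an edit can be modelled as a simple filter: a record is updated only if it is in the search result and the operation would actually change it. The sketch below is an illustration of that rule, not Arches code. The function name `records_to_update`, the record ids, and the use of Python's `str.capitalize` as a stand-in for the Arches capitalize operation are all assumptions.

```python
def records_to_update(search_hits, node_values, needs_edit):
    """Return ids that are in the search result AND would be changed by the edit."""
    return [
        rid
        for rid, value in node_values.items()
        if rid in search_hits and needs_edit(value)
    ]


def not_capitalized(text):
    """Stand-in check for a capitalize operation (assumed semantics)."""
    return text != text.capitalize()


# Five records hold a value in the node; the search URL returns only three of them.
node_values = {"r1": "castle", "r2": "Tower", "r3": "bridge", "r4": "mill", "r5": "fort"}
search_hits = {"r1", "r2", "r3"}

print(records_to_update(search_hits, node_values, not_capitalized))  # ['r1', 'r3']
```

Here `r2` is skipped because it is already capitalized, and `r4` and `r5` are skipped because they fall outside the search result.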
Preview button
Once you’re satisfied with the options you’ve selected click the preview button to preview a
small set of records that match your criteria to see the before and after of the edit operation.
Start button
Click the start button if you’d like to actually kick off the edit operation. You will be taken to the Task Status tab.
Depending on the operation selected and the number of resources being edited, this can take some time.
Edit operations are placed into a work queue and at this point you can leave this page. The Task Status
will update itself every 5 seconds (there is no need to refresh the page).
The Export tab of the Bulk Data Manager enables Arches administrators to make mass exports of resource instance data. The exported data will be in Excel workbooks. You can choose to export data expressed in either a “Branch” or a “Tile” structure.
Export of a resource instance data into an Excel workbook with the Branch structure#
If you have resource instances that include datatype “file-list” nodes, then the files associated with those nodes will be exported into a zip file.
The Tile and Branch data exports use exactly the same formats as the corresponding Bulk Data Manager importers. This means that you can use the Bulk Data Manager to export data, make edits to the exported data, and then re-import the edited data. This can be useful for making mass edits to data that is not easily edited in the Arches user interface. The data made available through the Export tools will also provide invaluable examples of how to express data in a manner suitable for import.
Example “Branch” Excel export of resource instance data#
Arches is a complex platform, and some users must be able to access specific areas of the application while being restricted from others.
This level of access is handled by adding Users to certain Groups through the Django admin interface.
Note
You can access the Django admin at localhost:8000/admin. The default admin credentials are admin/admin,
which must be changed in production. Any user with “staff” status can access the Django admin panel.
Once logged into the admin panel, you’ll see this at the top of the page:
Arches site administration in Django admin panel.#
Click Users to see a list of all your Arches users. Selecting a user will yield a generic
profile page like this:
In the “Permissions” section here there are three fields.
Active:
This account is active and the user can log in. Unchecking this box allows you to retain a user account while disallowing them from accessing Arches.
Staff status:
This user can access (and make changes within) the Django admin panel.
Superuser status:
This user has full access to the entire Arches platform, and is considered a member of every Group.
Next, you’ll see where you can assign the user to different Groups. Arches comes with many default
groups, and each one gives its members access to different parts of the application. A user can be
a member of as many different groups as needed.
Graph Editor:
Use Case:
For creating and testing branches and models.
Access Privileges:
Create/design graphs, branches, functions, and RDM. Add/edit business data with Resource Editor privileges. Unable to access system settings or mobile projects.
Resource Editor:
Use Case:
Ability to add/edit/delete provisional data more liberally than a Crowdsource Editor user.
Access Privileges:
Add/edit/delete resources.
RDM Administrator:
Use Case:
Add/edit/manage RDM concepts
Access Privileges:
Full access to the RDM - no access to the rest of Arches.
Application Administrator:
Use Case:
Control over Django admin page… can add/edit/delete users and user groups within Django admin console
Access Privileges:
Has Django superuser status (see above) which gives it full access to Arches.
System Administrator:
Use Case:
Changing data stored in the system settings graph.
Access Privileges:
Ability to access/edit data in Arches System Settings.
Crowdsource Editor:
Use Case:
Creation of provisional data from an untrusted source. Default group user is assigned to when first added to the system via e-mail sign-up.
Access Privileges:
Add/edit/delete your own provisional data tiles.
Guest:
Use Case:
Read-only access for anonymous users (non-authenticated users are automatically in this group)
Access Privileges:
Read-only access to all business data
Resource Reviewer:
Use Case:
Review provisional data and promote it to authoritative
Access Privileges:
Add/Edit authoritative business data. Ability to promote provisional data to authoritative.
Resource Exporter:
Use Case:
Control permissions to make exports of search result resource instances
Access Privileges:
This group was added in Arches version 7.4.0. Membership in this group is now required to export resource instance data from search results. By default, the anonymous user is a member of this group. If you want to disable export of resource instance data from searches for anonymous users, remove the anonymous user from this group. Similarly, you can control resource instance export privileges for other users by adding or removing them from the ResourceExporter group.
Feel free to make new groups as needed, but do not remove any of those listed above. Groups are also used in
other aspects of permissions as described below.
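The interaction between the three permission flags and group membership can be summarized in a small model. This is a plain-Python sketch, not Arches or Django code. The field names mirror Django's user flags (`is_active`, `is_staff`, `is_superuser`), while the `User` class and the helpers `in_group` and `can_open_admin` are hypothetical. Treating an inactive account as belonging to no group is an assumption that follows from inactive users being unable to log in.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    username: str
    is_active: bool = True
    is_staff: bool = False
    is_superuser: bool = False
    groups: set = field(default_factory=set)


def in_group(user, group):
    """Superusers count as members of every group; inactive users of none (assumed)."""
    if not user.is_active:
        return False
    return user.is_superuser or group in user.groups


def can_open_admin(user):
    """The Django admin panel requires an active account with staff status."""
    return user.is_active and user.is_staff


admin = User("admin", is_staff=True, is_superuser=True)
editor = User("jo", groups={"Resource Editor"})

print(in_group(admin, "RDM Administrator"))   # True
print(in_group(editor, "Resource Reviewer"))  # False
print(can_open_admin(editor))                 # False
```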
Permissions are applied to each card and by default, the guest user (aka anonymous user) has read privileges to all data.
If you have data you do not want to share with all users, follow these directions when designing your database: Permissions Tab.
If you want to ensure that all media file (uploaded photographs, etc.) access requires authentication, you can set RESTRICT_MEDIA_ACCESS to True.
Be aware that in doing so, all media file requests will be served by Django rather than Apache. This will adversely impact performance when serving large files or during periods of high traffic.
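As a sketch, the setting could be added to your project's settings module as shown below. The file name in the comment is an assumption; use whichever settings file your project actually loads.

```python
# settings_local.py (assumed location; adjust to your project's settings module)

# Require authentication for all media file requests.
# Media will then be served by Django instead of the web server.
RESTRICT_MEDIA_ACCESS = True
```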
As of Arches version 7.4.0, you can assign different permissions to specific Arches users and groups. To manage Map Layer Permissions, log in to the Django Admin User Interface, click the link to “Map layers” under models, and then click on the specific Map Layer whose permissions you would like to update.
To update permissions of a specific Map Layer, navigate to the OBJECT PERMISSIONS link in the upper right as illustrated below:
Link to the Object Permissions update form for a Map Layer in the Django Admin panel.#
Note
You will ALSO need to make sure the Ispublic flag for the Map Layer is deactivated. That flag is located lower down, well below the link to the OBJECT PERMISSIONS, see below:
Location of the Ispublic flag for a Map Layer in the Django Admin panel.#
Once you click the OBJECT PERMISSIONS link, you will see a form that will let you name users (by their username) and groups, by their group name. Once you add the name for the user or group, press the “Manage user” or “Manage group” button as appropriate. See the illustration below for an example:
Adding a Group Name to the Object Permissions for a Map Layer in the Django Admin panel.#
After clicking the “Manage user” or “Manage group” button, you will reach another form where you can add or subtract specific permissions for this user (or group) and Map Layer. See the illustration below for an example:
Editing a group’s specific permissions to a Map Layer in the Django Admin panel.#
Once you have updated the permissions, it’s a good idea to test the Arches interface to make sure the permissions for the Map Layer are properly applied.
This feature is in preview; it is not feature complete and may have some bugs.
As of Arches version 6.1.0, it is possible to create spatial views of resource instance data that can be consumed
by any client that supports PostGIS spatial views.
Currently the preview only allows the spatial views to be created in Django admin by managing the Spatial Views entities.
The spatial views are only able to represent the data in a flattened state, meaning that the data in nested cards are
flattened into a single comma separated attribute value, with the card sort order honoured. Therefore, it is important
to consider how to attribute the views being created.
The Spatial View model schema is defined as follows:
Spatialviewid
Unique identifier for the spatial view.
Schema
The database schema that the spatial view belongs to. public is used by
default but if another is used then it must have already been created in the database.
Slug
This will be joined with the Schema to form the name of the spatial
view. This value must follow slug format: only lower-case letters, numbers,
and hyphens. It cannot start with a number.
Description
The text that is added as a comment on the spatial view in the database,
which can be accessed as metadata by consuming clients where supported.
pg_featureserv, for example, will present this as the layer description.
Geometrynodeid
The UUID of the geojson-feature-collection node that underpins the geometry
of the spatial view.
Ismixedgeometrytype
Boolean value that indicates whether the geometry of the spatial view is a
mix of different geometry types. This is ideal where
the spatial view will be used by a vector tile service.
Default value is false.
Attributenodes
A JSON object that contains a list of attribute objects defining the UUIDs of
the nodes that comprise the attributes of the spatial view and a text description
of each attribute for metadata.
Note
The names of the attributes are automatically generated from the node names using a PostgreSQL-compliant format.
nodeid
The UUID of the node that needs adding. This must be in the same model as the Geometrynodeid.
description
The text description of the attribute, which will be added as metadata.
Isactive
Boolean value that indicates whether the spatial view is available. When set to
false the spatial view is removed from the database, but allows the definition
to remain. Setting to true recreates the spatial view in the database.
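Before entering a definition in Django admin, it can help to check the Slug and assemble the Attributenodes value in a script. The sketch below restates the slug rule given above as a regular expression and builds the attribute list as JSON. The helper names are hypothetical, and the key name `nodeid` is an assumption inferred from the field descriptions (only `description` is named explicitly), so confirm the exact keys against your Arches version.

```python
import json
import re
import uuid

# Slug rule as stated above: lower-case letters, numbers and hyphens only,
# and the first character must not be a number.
SLUG_RE = re.compile(r"[a-z-][a-z0-9-]*")


def is_valid_slug(slug):
    """Return True if `slug` satisfies the stated slug rule."""
    return SLUG_RE.fullmatch(slug) is not None


def attribute_nodes(pairs):
    """Build the Attributenodes JSON from (node UUID, description) pairs.

    uuid.UUID raises ValueError for a malformed UUID, which catches typos early.
    The "nodeid" key name is an assumption.
    """
    items = [
        {"nodeid": str(uuid.UUID(node_id)), "description": description}
        for node_id, description in pairs
    ]
    return json.dumps(items, indent=2)


print(is_valid_slug("heritage-sites"))
print(attribute_nodes([("6f1e2c0a-4b7d-4c3e-9a51-2d8f0b7c1e44", "Name of the site")]))
```

The UUID in the usage line is a made-up placeholder; substitute the node ids from your own resource model.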
To use the spatial views in your client application or datasource for a service, you will need to configure that client
to connect to the database using the following credentials:
host: the hostname of the arches database server
port: the port of the arches database server
database: the name of the arches database
user: arches_spatial_views
password: arches_spatial_views
If you are using a client that requires views to be geometry-type specific (for example ArcGIS), ensure that you have set Ismixedgeometrytype to false.
Important
Currently it is not possible to use the user/groups permissions to restrict access.
You will need to manually create specific database users and assign them to the spatial views.
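Many clients accept the connection details listed above as a single PostgreSQL connection URI. The following sketch assembles one with the Python standard library. The function name `spatial_views_dsn` and the host, port, and database values in the usage line are placeholders, and the default user and password are the ones listed above, which you should replace if you have created dedicated database users.

```python
from urllib.parse import quote


def spatial_views_dsn(host, port, database,
                      user="arches_spatial_views",
                      password="arches_spatial_views"):
    """Build a postgresql:// connection URI, percent-encoding special characters."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


# Placeholder values; substitute your own database server details.
print(spatial_views_dsn("localhost", 5432, "arches"))
```

Percent-encoding matters mainly for passwords that contain characters such as `@` or `/`, which would otherwise break the URI.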
You may create new Resources only if you have access to the Resource Manager page. From there, you will begin by choosing which Resource Model you would like to use. Note that a Resource Model must have its status set to active for it to appear in the Resource Manager.
Your Resource Manager page may look different than this image, depending on what Resource Models you have set up in your database.#
The Resource Editor is used to create new or edit existing Resources. On the left-hand side of the page you will see this Resource’s “card tree”, which shows all of the data entry cards that you can edit. Think of “creating data” as “adding cards”.
To begin, select a card, enter data, and click Add. Some cards may allow multiple instances, in which case you will be able to add as many of the same type as you want.
If you are a member of the Resource Editor group, all of your edits–either creating new resources or editing existing ones–will be considered “Provisional”. A member of the Resource Reviewer group can then approve your edits, making them “Authoritative”.
Resource Editor makes an edit:
For Resource Reviewers, search results indicate provisional data:
Resource Editors only see provisional data while using the resource editor.
Resource Reviewer will be prompted to Q/A the edit:
Accept or Decline:
Approved edits are immediately visible:
Tip
A Resource Reviewer can also use the “Q/A Type” search filter (see images above) to only find resources with (or without) provisional edits.
Managing generic relationships as described below is still an available feature in Arches. However, this feature will be deprecated in a future release.
Users are strongly encouraged to use the resource-instance datatype to manage relationships between resource instances.
The ability to visualize connections across resource-instance datatype nodes will accompany the deprecation of the generic resource relationship.
From the Resource Editor you can also access the Related Resources Editor, which is used to create a relationship between this resource and another in your database. To do so, open the editor, find the resource, and click Add. Your Resource Model will need to be configured to allow relations with the target Resource Model. If relations are not allowed, resources in the dropdown menu will not be selectable.
After a relation has been created, you can further refine its properties, such as what type of relation it is, how long it lasted, etc. While viewing the relation in grid mode, begin by selecting the relation in the table. You will see the “Delete Selected” button appear. Next click “relation properties”, enter the information, and don’t forget to “Save” when finished.
Creating a relationship between two resources in Arches, and adding properties to that relationship.#
Note
Creating a relationship between two resources using the related resource editor is fundamentally different from creating a resource instance node in graph. Creating a relationship is good for making a visual “web” of resource relationships. Using a resource instance node in a Resource Model’s graph allows you to “embed” one resource inside of another.
Arches provides user interface features to delete node instances used to describe resource instances. Arches also provides user interface features to delete any single given resource instance. Finally, Arches provides a feature to delete all resource instances for a given resource model.
To find a resource instance that you want to delete, you can start by using the search interface (learn more here: Searching). Assuming you are logged in as a user with permissions to edit the resource instance of interest, you should find a link to Edit the resource instance in the search results. The animation below illustrates the use of search and the Edit link:
Once you have opened a resource instance to edit, you have the option to delete (and then update if you choose) node instances associated with the resource instance. Node instance data are data that describe a given resource instance according to the structures defined by “branches”. The animation below illustrates different options one can use to delete an example annotation node instance:
Sometimes you may wish to entirely delete a given resource instance. To do so, follow the directions above to find the resource instance you wish to delete and follow the Edit link. If you have edit permissions, on the resource instance edit page, you should see a Manage button toward the upper left corner of the page. Click on this, and select the Delete Resource option and then confirm your choice if you are sure you want to permanently delete the resource instance. See the animation below for an example resource instance deletion:
Delete ALL Resource Instances for a Resource Model
Arches also provides a user interface feature for bulk deletion of all resource instances created for a given resource model. To do so, navigate to the Arches Designer and select the Resource Models option. Then hover over the resource model of interest, and click on the Manage options on the right. You can then select the Delete Instances option, and after confirming your choice, you will delete all resource instances for that resource model. The animation below illustrates deletion of all resource instances for a resource model:
Warning
Obviously, deleting all resource instances for a given resource model can be a drastic measure. It may be a good idea to export and backup your data prior to such major changes (see Export Commands).
The Arches application itself comes bundled with brief guides on the use of search features. Users can open these guides by clicking on the ? button in the top right of the search interface. The animation below illustrates activation of the search guide.
Arches supports powerful features to search text. The text search can be used to match strings of text in any branch describing resource instances. One can also click on a search term for opposite effect, that is, to negate or exclude resource instances that match a given term. The animation below illustrates a text search to find resource instances containing the text “Iran” then a text search that excludes resource instances that contain the term “Iran”.
The Arches search feature supports a number of operators that you can use to find and retrieve information. The following lists different operator types and their expected search behavior:
Exact: Putting your search term within quotes executes an exact search. An exact search only finds values that match the entire search string exactly, including case. For example, if the search term is “"Excavation Unit"” it will find the value “Excavation Unit 4”. It will NOT find “… this excavation unit …“.
Like: This is a prefix-based search. It finds values in which every term in the search string is the beginning of some word, in any order. It is not a “contains” search. For example, if the search term is “st bo” it will find “… stock book …” , or “… books in stock …”. It will NOT find a value with just “book” or find the value “last book” (where “last” ends in “st”).
Equals: This is a complete phrase search. It will only match full words in the exact order given. For example, if the search term is “stock book” it will find the value “Knoedler Stock Book 4”. It will NOT find “… books in stock …“, or “… stocking new books …”
Wildcard: This assumes the user is going to supply a search phrase exactly as they want. For example, a search for “st?ck” will find the values of “stock”, or “Stick”.
It will NOT find “The car is stuck in the woods”, because the “stuck” is surrounded by other words. If you need a “contains-like” query, then you need to supply leading and trailing asterisks. So, “*st?ck*” WILL find “The car is stuck in the woods”.
In all the examples above we quote the search string for clarity. In the Arches application you wouldn’t need to put your search terms inside quotes.
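The difference between the Like and Equals behaviors described above can be illustrated with a short plain-Python sketch. This is only a model of the documented matching rules, not how Arches or Elasticsearch implement them; the function names are hypothetical.

```python
import re


def _words(text):
    # Lowercase the text and split it into words made of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def like_match(query, value):
    """Like: every query term must be the prefix of at least one word."""
    words = _words(value)
    return all(any(w.startswith(term) for w in words) for term in _words(query))


def equals_match(query, value):
    """Equals: the query words must appear as full words, in the given order."""
    q, words = _words(query), _words(value)
    return any(words[i:i + len(q)] == q for i in range(len(words) - len(q) + 1))


for value in ["stock book", "books in stock", "book", "last book"]:
    print("like 'st bo' vs", repr(value), like_match("st bo", value))

for value in ["Knoedler Stock Book 4", "books in stock", "stocking new books"]:
    print("equals 'stock book' vs", repr(value), equals_match("stock book", value))
```

Running the loops reproduces the documented examples: the Like search accepts terms in any order but requires each one to start a word, while the Equals search requires whole words in sequence.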
Arches version 7.4 introduced features that allow users to use backslashes (”\”) to escape special characters used as search operators. For example, if one wanted to do a search for the string “sheep?”, where the retrieved text needs to include the question-mark “?” character, one would need to escape the “?” character so that it is NOT interpreted as a Wildcard search operator (see above). The string “sheep\?” escapes the “?” character and would search for the string “sheep?”.
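To make the wildcard and escaping rules concrete, here is a minimal sketch that translates a search pattern into a regular expression. It assumes “?” stands for exactly one character, “*” for any run of characters, and a backslash makes the next character literal; the helper names are hypothetical and the real search is executed by Elasticsearch, not by code like this.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ?, * and backslash escapes into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: treat the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # any run of characters
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def wildcard_match(pattern, value):
    # The whole value must match, which is why a "contains" query
    # needs leading and trailing asterisks.
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(wildcard_match("st?ck", "Stick"))
print(wildcard_match("st?ck", "The car is stuck in the woods"))
print(wildcard_match("*st?ck*", "The car is stuck in the woods"))
print(wildcard_match("sheep\\?", "sheep?"))
print(wildcard_match("sheep\\?", "sheeps"))
```

Note that in Python source the backslash itself has to be doubled inside a normal string literal; the pattern the function receives is still sheep\?.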
Arches also has “Advanced Search” options that enable users to search with specific branches to find resource instances. This adds much more precision to a search. For example, if you have resource instances described by both a “Color” branch and a “Material” branch a simple search for the term “gold” may return some unwanted results. The Advanced Search allows you to specify that you want to search within the “Material” branch, so a search for the term “gold” will return resource instances described by the material “gold”, not just the color “gold”.
Advanced Search is very powerful because you can use it to combine multiple search criteria together and compose complex queries. The animation below illustrates use of the Advanced Search option to search within specific branches.
This section provides information on command line utilities, extensions, software and template customizations, API integrations, data modeling and other topics that are relevant to developers.
Arches is very flexible and customizable. This section provides guidance on how to use APIs to integrate Arches with other information systems, enhance accessibility, build custom extensions, and modify Arches to deploy custom features beyond the capabilities of standard (uncustomized, core) Arches.
If you are considering software development to customize Arches, please read the Arches Customization Considerations for an introduction about good practices to help make customizations easier to develop, sustain, and maintain.
If you are leading a project or organization considering customizing Arches software, please read this document carefully. Customization is an inherently risky endeavor, especially if you need to maintain and support your information systems over multiple years.
The practices described here will reduce costs, reduce long term maintenance and security risks, and will lead to greater impact, enhanced sustainability, and open doors for future opportunities. That said, to maximize sustainability, security, maintainability, quality and impact, it is best practice to coordinate and discuss customization plans with the wider Arches open source community. If you haven’t already done so, please join the Arches Community Forum!
To increase the likelihood that customizations will have long term compatibility and maintainability with Arches, please use the customization patterns supported and documented by Arches. These patterns include:
Adherence to the extension design patterns helps to isolate your customizations from changes to the core of Arches. Following Arches extensions design patterns will also increase the likelihood that there will be relevant documentation and community help if the extensions need updates in the future. Certain customizations are easier to maintain over time. For example, an overwritten HTML template is generally simpler to upgrade than an inherited Arches Python class or an overwritten Django view. You should factor such considerations into long term resource planning.
While the Arches extensions architecture offers a great deal of flexibility, there may be scenarios where you need additional flexibility. From a sustainability and maintenance perspective, this scenario has important risks that need to be understood and factored into long term resource planning and engineering.
Managing long term sustainability and maintenance risks should be a core software engineering focus. Ideally, you should isolate your customized module from the core of Arches as much as possible. One preferred way to accomplish this is to develop Arches Apps (see: Creating Applications), which are discrete Python packages that can be integrated into one or more Arches projects. The Arches Apps documentation details their sustainability advantages.
The Arches API can be used to support customizations, especially those involving integration of Arches with other information systems. Channeling all connections between Arches and other systems through the API aligns with a design practice often described as “Loose Coupling”. By carefully limiting and simplifying how core Arches interfaces with external information systems, you reduce future maintenance burdens, because problems can be identified and fixed in a more focused manner.
Things break over time, especially if they are customized and not widely supported by the broader community. One way to help manage long term risks is to plan for “graceful degradation”. If your custom module is isolated from core Arches (via Arches App development and/or loose coupling of API integrations), then if it breaks or can no longer be maintained, the core Arches system should still be perfectly serviceable. Planning for obsolescence and the retirement of hard-to-maintain components is often essential in contexts where Arches is deployed, especially in the cultural heritage sector.
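As a rough illustration of graceful degradation, the sketch below wraps a call to an optional integration so that a failure is logged and a fallback value is used instead of the error propagating into the rest of the application. The function names and the logger name are hypothetical; this is one possible pattern, not an Arches feature.

```python
import logging

logger = logging.getLogger("my_project.integrations")


def with_fallback(optional_call, fallback):
    """Run optional_call(); if it raises, log the failure and return fallback."""
    try:
        return optional_call()
    except Exception:
        logger.exception("Optional integration failed; continuing without it")
        return fallback


def fetch_external_labels():
    # Hypothetical stand-in for a call to an external system that is down.
    raise ConnectionError("external service unavailable")


# The core workflow keeps going with an empty list instead of crashing.
labels = with_fallback(fetch_external_labels, fallback=[])
print(labels)
```

Keeping such wrappers at the boundary between your custom module and core Arches means a broken integration can be retired later by removing one call site.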
The following is our recommendation for creating an Arches environment that works well for developers. The first thing to consider is the general structure that will be in place, presumably all in the same directory:
Runtime Content
ENV/ - A Python 3.10+ virtual environment (you can name this whatever you want).
arches/ - The local clone of your fork of the archesproject/arches repo, this part of the code is often referred to as “core Arches.”
my_project/ - The location of your Arches project. This is the app in which you will be making the majority of your front-end customizations (new images, new template contents, etc.).
Database Configuration Storage
my_package/ - The location of your Arches package. Packages can store custom database definitions that you will create, and are loaded into a project through a one-time command line operation.
You may also be planning to use externally hosted components, like a remote Postgres/PostGIS or Elasticsearch installation. In that case make sure you have the connection information handy, you will need it in a later step.
will give you the stable branch for the 6.0.0 release.
Install the local core Arches
This is instead of using pip install arches, which would install the PyPI Arches distribution directly into ENV. When you install the local clone as shown below, any code changes you make inside of arches/ (like checking out a new git branch) will be immediately reflected in your runtime environment.
If you later switch to a new git branch, you may need to rerun pip install -r arches/install/requirements.txt, as the Python dependencies do change over the course of Arches releases.
Think of the packages as external storage for complex database configurations like Resource Models, or custom components like Datatypes. A package allows you to back up and share this type of content outside of the project itself. In some cases, however, projects and packages can become interdependent.
Look at Understanding Packages for more information on how to create and maintain packages.
In your project you can overwrite core Arches functionality in many ways. In general, doing so is preferable to directly altering any code in core Arches.
To overwrite existing (or add your own) style rules, create project.css in your project’s media directory like this: my_project/my_project/media/css/project.css and place style content in there. By default, these rules are linked in the base Arches UI templates. To use these same rules on the splash page, add
For static files such as these, if you create a file in your project that matches the relative directory structure and name of that same file in core Arches, Django will inherit your new file and ignore the original Arches one.
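The override behavior for static files can be pictured as a "first match wins" lookup across directories. The sketch below is a simplified model of that idea, assuming the project directory is searched before core Arches; Django's actual staticfiles finders are more involved, and the resolve_static helper is hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_static(relative_path, search_dirs):
    """Return the first existing file for relative_path, searching dirs in order."""
    for directory in search_dirs:
        candidate = Path(directory) / relative_path
        if candidate.is_file():
            return candidate
    return None


# Build two throwaway media directories that both contain css/project.css.
with tempfile.TemporaryDirectory() as tmp:
    project_media = Path(tmp, "my_project", "media")
    core_media = Path(tmp, "arches", "media")
    for root in (project_media, core_media):
        (root / "css").mkdir(parents=True)
        (root / "css" / "project.css").write_text(f"/* from {root.parts[-2]} */")

    found = resolve_static("css/project.css", [project_media, core_media])
    print(found.read_text())
```

Because the project directory comes first in the search order, the project's copy is the one that is found and the core copy is ignored.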
It is much more complex to override dynamic content like a core Arches view, but entirely possible. For example, you could create views.py in your project and define a new view class in it like this, which inherits a core Arches view class.
# views.py
from arches.app.views.user import UserManagerView


class MyUserManagerView(UserManagerView):
    # add a random print statement to make sure this class is used
    print("in MyUserManagerView")

# urls.py
from django.conf import settings
from django.conf.urls.static import static
from django.urls import include, path

from .views import MyUserManagerView

urlpatterns = [
    # match and return your custom view before the default Arches url can get matched.
    path("user/", MyUserManagerView.as_view(), name="user_profile_manager"),
    path("", include("arches.urls")),
] + static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)
which will cause /user to match your new view before the core Arches /user url is found. Thus, going to localhost:8000/user will still return the default Arches profile manager page, but it has been passed through your class. You can now add a get() method to your class and it will be called to return the view instead of arches.app.views.user.UserManagerView().get().
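The two ideas at work here, first-match URL resolution and method overriding in a subclass, can be demonstrated without Django. In the sketch below the classes are stand-ins (not the real Arches views) and resolve() is a deliberately naive prefix matcher, not Django's URL resolver.

```python
def resolve(path, urlpatterns):
    """Return the view of the first pattern whose prefix matches path."""
    for prefix, view in urlpatterns:
        if path.startswith(prefix):
            return view
    return None


class CoreUserView:
    # Stand-in for the core Arches view class.
    def get(self):
        return "core profile page"


class MyUserView(CoreUserView):
    # Overriding get() means this method runs instead of the core one.
    def get(self):
        return "custom: " + super().get()


urlpatterns = [
    ("user/", MyUserView),   # listed first, so it wins for "user/"
    ("", CoreUserView),      # catch-all, standing in for the core Arches urls
]

print(resolve("user/", urlpatterns)().get())
print(resolve("search/", urlpatterns)().get())
```

If the two entries in urlpatterns were swapped, the catch-all would match first and the custom view would never be reached, which is why the custom pattern must come before the core include.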
Note
Remember: Arches is built with Django, so your best resource for more in-depth customization of projects is the Django documentation itself.
Warning
As a rule of thumb, the more complex the customizations are that you add to a project, the more difficult it will be to retain these changes when you upgrade to later core Arches versions.
With the local clone of core Arches linked to your virtual environment, you can upgrade by simply pulling the changes to your local clone of the repo, or switching to a new release branch.
To upgrade projects, check the release notes which typically contain detailed instructions.
In general, you should always expect to
Reinstall Python dependencies in core Arches:
(ENV)$ cd arches
(ENV)arches/$ pip install -r requirements.txt
Apply database migrations in my_project:
(ENV)$ cd my_project
(ENV)my_project/$ python manage.py migrate
Reinstall javascript dependencies in my_project/my_project:
(ENV)$ cd my_project/my_project
(ENV)my_project/my_project$ yarn install
Finally, if you have added custom logic or content to your project, you must make sure to account for any changes in the core Arches content that you have overwritten or inherited.
Tests must be run from core Arches. Enter arches/ and then use:
(ENV)arches/$ python manage.py test tests --pattern="*.py" --settings="tests.test_settings"
It is possible that you will need to add or update settings_local.py inside of arches/ in order for the tests to connect to Postgres and Elasticsearch.
Arches uses Elasticsearch as its search engine. A handful of settings.py variables point your Arches project to an Elasticsearch installation, in which your indexes will be created. An ELASTICSEARCH_PREFIX string is prepended to all of your project’s indexes, meaning that a single Elasticsearch installation can be used by multiple projects.
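The effect of the prefix can be sketched in a few lines. The underscore separator and the index names used here are assumptions for illustration; check your Elasticsearch installation for the names Arches actually creates.

```python
def prefixed_index_name(prefix, index_name):
    """Combine a project prefix with an index name (separator is assumed)."""
    return f"{prefix}_{index_name}" if prefix else index_name


# Two projects sharing one Elasticsearch installation end up with
# disjoint sets of index names.
project_a = {prefixed_index_name("project_a", name) for name in ("resources", "terms")}
project_b = {prefixed_index_name("project_b", name) for name in ("resources", "terms")}

print(sorted(project_a))
print(sorted(project_b))
print(project_a.isdisjoint(project_b))
```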
One important thing to remember: Elasticsearch indexes are replicable derivatives of your Arches database, meaning that they can safely be dropped and recreated at any time. Similarly, if you need to change or upgrade your Elasticsearch setup, you need only update some settings and then reindex your database.
You can install Elasticsearch locally alongside Arches–read on for how to do that. You can also use managed Elasticsearch solutions by cloud providers like AWS.
The easiest way to install Elasticsearch is to download and unpack their archived releases. Archives are available at https://www.elastic.co/downloads/past-releases/elasticsearch-{releasenumber}, e.g. https://www.elastic.co/downloads/past-releases/elasticsearch-8-5-1.
Download the release for your OS and architecture and then unpack/unzip it. For example, installing 8.5.1 on Ubuntu Linux looks like:
Elasticsearch 8 introduced new security features. While you are working with Arches locally, i.e. during development, you can safely disable these features. Do not disable security features in production.
Make two changes:
In your config file, find xpack.security.enabled: true and set it to xpack.security.enabled: false. Now start/restart Elasticsearch (see below).
The config file is typically found at {path-to-elasticsearch}\config\elasticsearch.yml. If you installed the Debian package, you’ll find it at /etc/elasticsearch/elasticsearch.yml.
In your Arches project’s settings.py or settings_local.py, add
To run in the background, add -d to that command. To stop the background process, use ps aux | grep elasticsearch to get the process id, and then sudo kill -9 <process id>.
Windows:
On Windows, double-click the {path-to-elasticsearch}\bin\elasticsearch.bat batch file to run the process in a new console window.
To make sure Elasticsearch is running correctly, use
curl localhost:9200
You should get a JSON response that includes “You Know, For Search…”. You can also use the Chrome plugin ElasticSearch Head to view your instance in a browser at localhost:9200.
By default, Elasticsearch uses 2GB of memory (RAM). For basic development purposes, we have found it to run well enough on 1GB. Use ES_JAVA_OPTS="-Xms1g -Xmx1g" ./bin/elasticsearch -d to set the memory allotment on startup. You can use the same command to give more memory to Elasticsearch in a production setting.
Important
If you get an empty response from curl localhost:9200, this is likely because Elasticsearch security features are not properly set up, see Development Configuration above.
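The same check can be scripted. The sketch below uses only the Python standard library and assumes an unsecured development instance at http://localhost:9200 whose JSON response carries the greeting in a "tagline" field; an empty or non-JSON reply is treated as "not running correctly". The function names are hypothetical.

```python
import http.client
import json
import urllib.error
import urllib.request


def looks_like_elasticsearch(body):
    """Return True if body is the JSON greeting Elasticsearch serves at its root URL."""
    try:
        info = json.loads(body)
    except (TypeError, ValueError):
        return False  # empty or non-JSON reply
    if not isinstance(info, dict):
        return False
    return "for search" in str(info.get("tagline", "")).lower()


def elasticsearch_is_up(url="http://localhost:9200", timeout=2):
    """Fetch the root URL and check the greeting; any connection problem means False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return looks_like_elasticsearch(response.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


print(looks_like_elasticsearch('{"tagline": "You Know, for Search"}'))
print(looks_like_elasticsearch(""))
```

Calling elasticsearch_is_up() from a startup script gives a quick yes/no answer before running commands that need the search engine.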
You may need to reindex the entire database now and then. This can be helpful if a bulk load
failed halfway through, or if you need to point your database at a different Elasticsearch installation.
Be warned that this process can take a long time if you have a lot of resources in your database.
Also, if you are in DEBUG mode it can cause your server to run out of memory.
In production it’s advisable to have multiple Elasticsearch instances working together as nodes of
a single cluster. To do this, you need to install a second Elasticsearch instance, and change the
config/elasticsearch.yml file in each instance. Note that the cluster and node names can be whatever
you want, as long as the cluster.name is the same in both instances and the node.name is unique
to each one.
You’ll need to start/stop each of these instances individually, but you should always
have both running. When they are, the secondary node will automatically find the master
node and the indices will be replicated between the two.
Nothing about your project’s settings.py should change; Arches need only connect
to the original Elasticsearch instance as before. However, you’ll see now in the console output
that the cluster health will be [GREEN] when you have two nodes running (it’s [YELLOW]
if you only have one).
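The naming rule for the two instances (same cluster.name, unique node.name) is easy to get wrong when editing files by hand. The following sketch checks the rule and renders the two settings as lines for config/elasticsearch.yml. The cluster and node names are placeholders, and discovery or network settings that a real multi-node cluster may need are deliberately left out.

```python
def validate_nodes(nodes):
    """Return a list of problems with the cluster.name / node.name settings."""
    problems = []
    if len({node["cluster.name"] for node in nodes}) != 1:
        problems.append("cluster.name must be identical on every node")
    node_names = [node["node.name"] for node in nodes]
    if len(set(node_names)) != len(node_names):
        problems.append("node.name must be unique per node")
    return problems


def render_yaml(node):
    """Render the settings as 'key: value' lines for elasticsearch.yml."""
    return "".join(f"{key}: {value}\n" for key, value in node.items())


nodes = [
    {"cluster.name": "arches-cluster", "node.name": "node-1"},
    {"cluster.name": "arches-cluster", "node.name": "node-2"},
]

print(validate_nodes(nodes))
for node in nodes:
    print(render_yaml(node))
```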
Arches allows you to create a custom index of resource data for your specific use case (for use in Kibana for example).
To add a new custom index create a new python module and add to it a class that inherits from arches.app.search.base_index.BaseIndex and implements the prepare_index and get_documents_to_index methods.
ELASTICSEARCH_CUSTOM_INDEXES = [{
    'module': '{path to file with SampleIndex class}.SampleIndex',
    'name': 'my_new_custom_index'  # follow ES index naming rules, use this name to register in Elasticsearch
}]
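A custom index class of the kind described above might look like the following sketch. The method signatures, the index_metadata attribute, and the (document, id) return shape are assumptions based on the method names given in this section, so check them against the BaseIndex source for your Arches version. A small stand-in base class is defined so the example also runs where Arches is not installed.

```python
from types import SimpleNamespace

try:
    from arches.app.search.base_index import BaseIndex
except ImportError:
    class BaseIndex:
        """Minimal stand-in so this sketch runs without Arches installed."""

        def __init__(self, index_name=None):
            self.index_name = index_name
            self.index_metadata = None

        def prepare_index(self):
            pass


class SampleIndex(BaseIndex):
    def prepare_index(self):
        # Assumed: the mapping is stored on the instance before the base
        # class creates the index.
        self.index_metadata = {
            "mappings": {
                "properties": {
                    "tile_count": {"type": "integer"},
                    "graph_id": {"type": "keyword"},
                }
            }
        }
        super().prepare_index()

    def get_documents_to_index(self, resourceinstance, tiles):
        # Assumed return shape: the document to index and its id.
        document = {
            "tile_count": len(tiles),
            "graph_id": str(resourceinstance.graph_id),
        }
        return document, str(resourceinstance.resourceinstanceid)


# Try the document builder with a fake resource instance.
fake_resource = SimpleNamespace(graph_id="graph-1", resourceinstanceid="res-1")
print(SampleIndex("my_new_custom_index").get_documents_to_index(fake_resource, [{}, {}]))
```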
Arches allows any parameters to be passed in via custom HTTP headers OR via the querystring.
All requests to secure services require users to pass a “Bearer” token in the authentication header.
To use an HTTP header to pass in a parameter, use the form:
Querystring parameters are case sensitive. Behind the scenes, custom header parameters are converted to lower case querystring parameters.
In the following example there are 3 different parameters (“format”, “FORMAT”, and “Format”) with 3 different values (“html”, “json”, and “xml”) respectively
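Case sensitivity of querystring parameters can be seen with Python's standard URL parser. The URL below is hypothetical and only its querystring matters; header_param_name() simply models the documented lowercasing of parameters that arrive through custom headers.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL; three parameters that differ only by case.
url = "http://localhost:8000/example?format=html&FORMAT=json&Format=xml"
params = parse_qs(urlsplit(url).query)

for name, values in params.items():
    print(name, values)


def header_param_name(header_name):
    """Model the lowercasing applied to parameters passed via custom headers."""
    return header_name.lower()


print(header_param_name("FORMAT"))
```

The parser keeps all three keys separate, so a client that mixes cases is sending three unrelated parameters, whereas a parameter sent through a header always ends up under its lowercase name.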
Most Arches API endpoints require an OAuth access token.
OAuth 2.0 is a simple and secure authentication mechanism. It allows applications to acquire an access token for Arches via a quick redirect to the Arches site. Once an application has an access token, it can access a user’s resources on Arches. To authenticate with OAuth you must first Register an OAuth Application.
gets an OAuth token given a username, password, and client id
Note
You should only make this call once and store the returned token securely. You should not make this call per request or at any other high-frequency interval.
This token is to be used with clients registered with the “Resource Owner Password Credentials Grant” type
see Register an OAuth Application for more information on registering an application
returned when an invalid client id is supplied, or the registered client is not “public” or the
grant type used to register the client isn’t “Resource Owner Password Credentials Grant”
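The advice to request a token once and reuse it can be implemented with a small cache object. In this sketch the function that actually talks to the token endpoint is injected, so nothing here assumes a particular URL or payload; fake_fetch is a hypothetical stand-in, and token expiry or refresh is not handled.

```python
class TokenCache:
    """Fetch an access token on first use and reuse it afterwards."""

    def __init__(self, fetch_token):
        self._fetch_token = fetch_token  # callable that returns a token string
        self._token = None

    def get(self):
        if self._token is None:
            self._token = self._fetch_token()
        return self._token

    def authorization_header(self):
        return {"Authorization": f"Bearer {self.get()}"}


calls = []


def fake_fetch():
    # Hypothetical stand-in for the real token request.
    calls.append(1)
    return "example-token"


cache = TokenCache(fake_fetch)
for _ in range(3):
    header = cache.authorization_header()

print(header)
print(len(calls))
```

However many requests the client makes, the token endpoint is contacted only on the first one. Store the token securely if it has to survive between runs.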
curl -H "Authorization: Bearer {token}" -X GET http://localhost:8000/rdm/concepts/{concept instance id}
curl -H "Authorization: Bearer zo41Q1IMgAW30xOroiCUxjv3yci8Os" -X GET "http://localhost:8000/rdm/concepts/5e04c83e-1ae3-42e8-ae31-4f7c25f737a5?format=json&indent=4"
Example json response:
HTTP/1.0 200 OK
Content-Type: application/json

{"hassubconcepts":true,"id":"5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","legacyoid":"http://www.archesproject.org/5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","nodetype":"Concept","parentconcepts":[{"hassubconcepts":true,"id":"7b8e4771-2680-4004-9743-40ea78e8c2a9","legacyoid":"http://www.archesproject.org/7b8e4771-2680-4004-9743-40ea78e8c2a9","nodetype":"ConceptScheme","parentconcepts":[],"relatedconcepts":[],"relationshiptype":"hasTopConcept","subconcepts":[],"values":[{"category":"label","conceptid":"7b8e4771-2680-4004-9743-40ea78e8c2a9","id":"b18048a9-4814-43f0-bb88-99fa22a42fbe","language":"en-US","type":"prefLabel","value":"DISCO"},{"category":"note","conceptid":"7b8e4771-2680-4004-9743-40ea78e8c2a9","id":"16ea8772-d5dd-481d-91a7-c09703718138","language":"en-US","type":"scopeNote","value":"Concept scheme for managing Data Integration for Conservation Science thesauri"},{"category":"identifiers","conceptid":"7b8e4771-2680-4004-9743-40ea78e8c2a9","id":"9eaa8a10-e9f2-4ce3-ac8b-c4904097b4c9","language":"en-US","type":"identifier","value":"http://www.archesproject.org/7b8e4771-2680-4004-9743-40ea78e8c2a9"}]}],"relatedconcepts":[],"relationshiptype":"","subconcepts":[{"hassubconcepts":false,"id":"0788acb1-9968-43e8-80f7-37b37e155f95","legacyoid":"http://www.archesproject.org/0788acb1-9968-43e8-80f7-37b37e155f95","nodetype":"Concept","parentconcepts":[{"hassubconcepts":false,"id":"5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","legacyoid":"http://www.archesproject.org/5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","nodetype":"Concept","parentconcepts":[],"relatedconcepts":[],"relationshiptype":"narrower","subconcepts":[],"values":[]}],"relatedconcepts":[],"relationshiptype":"narrower","subconcepts":[],"values":[{"category":"label","conceptid":"0788acb1-9968-43e8-80f7-37b37e155f95","id":"dd5c6d39-7bc4-438e-abe2-544b8ae06864","language":"en-US","type":"prefLabel","value":"Artist"},{"category":"identifiers","conceptid":"0788acb1-9968-43e8-80f7-37b37e155f95","id":"5f355975-29a7-4a53-8260-4093d63c1967","language":"en-US","type":"identifier","value":"http://www.archesproject.org/0788acb1-9968-43e8-80f7-37b37e155f95"}]}],"values":[{"category":"label","conceptid":"5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","id":"b75ca80a-3128-421d-ae2b-aacb7d12bbc7","language":"en-US","type":"prefLabel","value":"DISCO Actor Types"},{"category":"identifiers","conceptid":"5e04c83e-1ae3-42e8-ae31-4f7c25f737a5","id":"79d2e5d2-91fc-435d-869a-042c994d3481","language":"en-US","type":"identifier","value":"http://www.archesproject.org/5e04c83e-1ae3-42e8-ae31-4f7c25f737a5"}]}
Accept – optional alternative to “format”, {“application/xml”, “application/json”, “application/ld+json”}
Example request:
curl -H "Authorization: Bearer {token}" -X GET http://localhost:8000/resources/{resource instance id}
curl -H "Authorization: Bearer zo41Q1IMgAW30xOroiCUxjv3yci8Os" -X GET "http://localhost:8000/resources/00131129-7451-435d-aab9-33eb9031e6d1?format=json&indent=4"
Example json response:
HTTP/1.0 200 OK
Content-Type: application/json

{"business_data":{"resources":[{"tiles":[{"data":{"e4b37f8a-343a-11e8-ab89-dca90488358a":"203 Boultham Park Road","e4b4b7f5-343a-11e8-a681-dca90488358a":null},"provisionaledits":null,"parenttile_id":null,"nodegroup_id":"e4b37f8a-343a-11e8-ab89-dca90488358a","sortorder":0,"resourceinstance_id":"99131129-7451-435d-aab9-33eb9031e6d1","tileid":"b72225a9-4e3d-47ee-8d94-52316469bc3f"},{"data":{"e4b3f15c-343a-11e8-a26b-dca90488358a":null,"e4b4ca3d-343a-11e8-ab73-dca90488358a":{"type":"FeatureCollection","features":[{"geometry":{"type":"Point","coordinates":[-0.559288403624841,53.2132233001817]},"type":"Feature","id":"c036e50a-4959-4b6f-93d0-2c03068c0948","properties":{}}]}},"provisionaledits":null,"parenttile_id":"4e40e6f3-8252-4439-831d-c371655cc4eb","nodegroup_id":"e4b3f15c-343a-11e8-a26b-dca90488358a","sortorder":0,"resourceinstance_id":"99131129-7451-435d-aab9-33eb9031e6d1","tileid":"65199340-32c3-4936-a09e-7c5143552d15"},{"data":{"e4b386eb-343a-11e8-82ef-dca90488358a":"Detached house built by A B Sindell"},"provisionaledits":null,"parenttile_id":"8870d2d6-e179-4321-a8bb-543fd2db63c6","nodegroup_id":"e4b386eb-343a-11e8-82ef-dca90488358a","sortorder":0,"resourceinstance_id":"99131129-7451-435d-aab9-33eb9031e6d1","tileid":"04bb7bef-1e6e-4228-bd87-3f0a129514a8"}],"resourceinstance":{"graph_id":"e4b3562b-343a-11e8-b509-dca90488358a","resourceinstanceid":"99131129-7451-435d-aab9-33eb9031e6d1","legacyid":"99131129-7451-435d-aab9-33eb9031e6d1"}}]}}
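For scripted access, the curl request above can be mirrored with Python's standard library, and the response can be regrouped by nodegroup. This is a sketch under assumptions: the helper names are hypothetical, the request is only built (not sent), and the sample payload is a shortened stand-in with the same shape as the example response.

```python
import urllib.request
from urllib.parse import urlencode


def build_resource_request(base_url, resource_id, token, **params):
    """Build (but do not send) a GET request for one resource instance."""
    url = f"{base_url.rstrip('/')}/resources/{resource_id}"
    if params:
        url += "?" + urlencode(params)
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def tile_data_by_nodegroup(payload):
    """Group the 'data' of every tile in the response by its nodegroup_id."""
    grouped = {}
    for resource in payload["business_data"]["resources"]:
        for tile in resource["tiles"]:
            grouped.setdefault(tile["nodegroup_id"], []).append(tile["data"])
    return grouped


request = build_resource_request(
    "http://localhost:8000", "{resource instance id}", "{token}", format="json", indent=4
)
print(request.full_url)

# Shortened stand-in payload with the same structure as the example response.
sample = {
    "business_data": {
        "resources": [
            {"tiles": [
                {"nodegroup_id": "ng-1", "data": {"node-a": "203 Boultham Park Road"}},
                {"nodegroup_id": "ng-2", "data": {"node-b": "Detached house"}},
            ]}
        ]
    }
}
print(tile_data_by_nodegroup(sample))
```

Passing the request object to urllib.request.urlopen() and decoding the body with json.loads() would produce a payload that can be handed to tile_data_by_nodegroup().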
Instead of identifying a graph by a UUID, one can also identify a graph by a slug identifier.
To get or set the slug for the graph, navigate to the root node of the Graph Designer. A request using a slug identifier for a graph looks like:
PUT /resources/{string:graphslug}/{uuid:resourceinstanceid}
get a list of mobile data collection projects that a user has been invited to participate in
Example request:
curl -H "Authorization: Bearer {token}" -X GET http://localhost:8000/mobileprojects
curl -H "Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJ1c2VySWQiOiJiMDhmODZhZi0zNWRhLTQ4ZjItOGZhYi1jZWYzOTA0NjYwYmQifQ.-xN_h82PHVTCMA9vdoHrcZxH-x5mb11y1537t3rGzcM" -X GET http://localhost:8000/mobileprojects
Example response:
HTTP/1.0 200 OK
Content-Type: application/json

[{"active":true,"bounds":"MULTIPOLYGON EMPTY","cards":[],"createdby_id":1,"datadownloadconfig":{"count":1000,"custom":null,"download":false,"resources":[]},"description":"A description of this project.","enddate":"2018-03-16","groups":[6],"id":"e3d95999-2323-11e8-894b-14109fd34195","lasteditedby_id":1,"name":"Forbidden Project","startdate":"2018-03-04","tilecache":"","users":[1]}]
returns a GeoJSON representation of resource instance data; this will include metadata properties when using paging for “_page” (number) and “_lastPage” (boolean). Returned features will include integer ids that are only assured to be unique per request.
NOTE: when not using the “use_uuid_names” parameter, field names will use the export field name provided for a given node (via the Graph Designer).
If the export field name is not defined, the API will attempt to create a suitable field name from the node name.
Property names that clash as a result of the above, or shortening via “field_name_length” will have their values joined together.
WARNING: including primary names has a big impact on performance and is best deferred to an additional request
Query Parameters:
resourceid – optional comma delimited list of resource instance UUIDs to filter feature data on
nodeid – optional node UUID to filter feature data on
tileid – optional tile UUID to filter feature data on
nodegroups – optional comma delimited list of nodegroup UUIDs from which to include tile data as properties.
precision – optional number of decimal places returned in coordinate values; used to constrain resultant data volume
field_name_length – optional number to limit property field length to
use_uuid_names – include this parameter to return tile property names as node UUIDs.
include_primary_name – include this parameter to include resource instance primary names in feature properties.
use_display_values – include this parameter to return tile values processed to be human readable
include_geojson_link – include this parameter to include a link to this specific feature in its properties fit for reuse later
indent – optional number of spaces with which to indent the JSON return (ie “pretty print”)
type – optional geometry type name to filter features on
limit – optional number of tiles to process; used to page data. NOTE: as paging is per tile, the count of features in the response may differ from this limit value
page – optional number of page (starting with 1) to return; used in conjunction with “limit”
Example request:
curl -X GET "http://localhost:8000/geojson?nodegroups=8d41e4ab-a250-11e9-87d1-00224800b26d,8d41e4c0-a250-11e9-a7e3-00224800b26d&nodeid=8d41e4d6-a250-11e9-accd-00224800b26d&use_display_values=true&indent=2&limit=3"
Example response:
HTTP/1.0 200 OK
Content-Type: application/json

{"_lastPage":false,"_page":1,"features":[{"geometry":{"coordinates":[-0.09160837,51.529378348],"type":"Point"},"id":1,"properties":{"application_type":"Enquiry","consultation_status":"Dormant","consultation_type":"Post-Application","development_type":"Mixed Use","name":"Consultation for 93 Mendota Alley","resourceinstanceid":"aa7ecf38-ab81-4e08-bb74-cfdd1e339ea2","tileid":"4e4d8fe8-3ee9-4ddc-9613-fffc1511bd58"},"type":"Feature"},{"geometry":{"coordinates":[-0.090902277,51.533642427],"type":"Point"},"id":2,"properties":{"application_type":"Listed Building Consent","consultation_status":"Completed","consultation_type":"Condition Application","development_type":"Land restoration","name":"Consultation for 57359 Fieldstone Way","resourceinstanceid":"2cf195f8-805b-4f97-9133-cbd94bf5a01f","tileid":"6e3009d4-4022-4510-8e42-504b5bc20b74"},"type":"Feature"},{"geometry":{"coordinates":[-0.088202575,51.533347841],"type":"Point"},"id":3,"properties":{"application_type":"Listed Building Consent","consultation_status":"Aborted","consultation_type":"Post-Application","development_type":"Road construction","name":"Consultation for 3660 Kim Court","resourceinstanceid":"eefa863a-53e4-404a-89b4-6213b46b2b55","tileid":"99395221-dd7f-4a06-8d87-5f5703501ab5"},"type":"Feature"}],"type":"FeatureCollection"}
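Since the example response reports "_lastPage": false, a client that wants every feature has to keep requesting pages. The sketch below shows one way to loop with the "limit" and "page" parameters until "_lastPage" is true. The fetch_page callable is hypothetical and injected so the loop can be shown without a running server; geojson_url() only builds the request URL.

```python
from urllib.parse import urlencode


def geojson_url(base_url, page, limit, **params):
    """Build a /geojson URL for one page of results."""
    query = dict(params, limit=limit, page=page)
    return f"{base_url.rstrip('/')}/geojson?{urlencode(query)}"


def iter_features(fetch_page, limit=3, max_pages=1000):
    """Yield features page by page until the response reports _lastPage."""
    page = 1
    while page <= max_pages:
        collection = fetch_page(page, limit)
        yield from collection.get("features", [])
        # A missing _lastPage is treated as "no more pages".
        if collection.get("_lastPage", True):
            break
        page += 1


# Fake two-page server response used in place of real HTTP calls.
fake_pages = {
    1: {"_page": 1, "_lastPage": False, "features": [{"id": 1}, {"id": 2}]},
    2: {"_page": 2, "_lastPage": True, "features": [{"id": 1}]},
}

features = list(iter_features(lambda page, limit: fake_pages[page]))
print(len(features))
print(geojson_url("http://localhost:8000", 2, 3, indent=2))
```

Because the integer feature ids are only unique per request, the ids on page 2 can repeat those on page 1; use the resourceinstanceid and tileid properties when combining pages.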
This page serves as a quick reference guide for working with Arches
through a command prompt. Along with default Django commands, a good
deal of Arches operations have been added to manage.py. In a
command prompt, activate your virtual
environment,
then run the following commands from your root app directory (the one
that contains manage.py).
All file or directory path parameters (-s, -c, -d) should
be absolute paths.
This argument with the value . indicates to pip that it should link the local directory with the virtual environment.
Installs Arches into your virtual environment from a local clone of
the archesproject/arches
repo, or your own fork of that repo. To do this properly, create a new
virtual environment and activate it, clone the repo you want, enter
that repo’s root directory, and then run the command. Also, this
command must be followed by:
pip install -r arches/install/requirements.txt
in order to properly install all of Arches’ python requirements. Make
sure to use \ instead of / on Windows.
Add this boolean argument to force the destruction
and recreation of your database before loading the package.
The source (-s) of a package can be either a path to a local
directory, the location of a local zipfile containing a package, or
the url to a github repo archive that contains a package. For example,
loading the sample package from where it resides in github would
just be:
If DEBUG=True, memory usage will continuously increase during indexing, because Django stores
all db queries in memory, and a lot of them happen during indexing. Be wary of this during development
when indexing large databases, or on servers with small memory provisions (you may want to temporarily
set DEBUG=False).
Starting with version 7.4, you can add the -rd or --recalculate-descriptors flag to the reindex management command to force resource instance primary descriptors to be recalculated prior to reindexing. See below:
The path to the mapping file. The mapping file tells Arches how to
map the columns from your csv file to the nodes in your
resource graph. This option is required if there is not a
mapping file named the same as the business data file and in
the same directory with extension ‘.mapping’ instead of ‘.csv’
or ‘.json’.
-ow
Determines how resources with duplicate ResourceIDs will be
handled: append adds more tile data to an existing
resource; overwrite replaces any existing resource with
the imported data. This option only applies to CSV
import. JSON import always overwrites.
-bulk, --bulk_load
Bulk load values into the database. By setting this flag the
system will use Django’s bulk_create
operation. The model’s save() method will not be called,
and the pre_save and post_save signals will not be
sent.
--create_concepts
Creates or appends concepts and collections to your rdm
according to the option you select. create will create
concepts and collections and associate them to the mapped
nodes. append will append concepts to the existing
collections assigned to the mapped nodes and create
collections for nodes that do not have an assigned collection.
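The difference between the append and overwrite behaviors described for -ow above can be pictured with a toy in-memory model. This is only a sketch of the semantics: `import_rows` and the `store` dictionary are hypothetical and do not reflect how the Arches importer is implemented.

```python
def import_rows(store, resource_id, tiles, mode="append"):
    """Apply imported tiles to a toy store keyed by ResourceID (hypothetical helper)."""
    if mode not in ("append", "overwrite"):
        raise ValueError("mode must be 'append' or 'overwrite'")
    if mode == "overwrite" or resource_id not in store:
        # overwrite: the imported data replaces whatever was there before
        store[resource_id] = list(tiles)
    else:
        # append: the imported tiles are added to the existing resource
        store[resource_id].extend(tiles)
    return store


store = {"r1": ["name tile"]}

import_rows(store, "r1", ["description tile"], mode="append")
appended = list(store["r1"])

import_rows(store, "r1", ["replacement tile"], mode="overwrite")
overwritten = list(store["r1"])
```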
packages operation, in this case export_business_data
-d
Absolute path to destination directory
-f
Export format, must be csv or json
-c
(required for csv) Absolute path to the mapping file you would
like to use for your csv export.
-single_file
(optional for csv) Use this parameter if you’d like to export
your grouped data to the same csv file as the rest of your
data.
-g
(required for json, optional for csv) The resource model UUID
whose instances you would like to export.
Exports business data to csv or json depending on the -f parameter
specified. For csv export a mapping file is required. The exporter
will export all resources of the type indicated in the
resource_model_id property of the mapping file and the -g parameter
will be ignored. For json export no mapping file is required, instead
a resource model uuid should be passed into the -g command.
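The option rules above (format must be csv or json, csv needs a mapping file, json needs a resource model UUID, paths should be absolute) can be checked before running an export. The helper below is a hypothetical pre-flight check written for illustration; it is not part of Arches and does not run the export itself.

```python
import os


def validate_export_options(fmt, dest_dir, mapping_file=None, graph_id=None):
    """Check export options against the rules described above (hypothetical helper)."""
    if fmt not in ("csv", "json"):
        raise ValueError("-f must be 'csv' or 'json'")
    if not os.path.isabs(dest_dir):
        raise ValueError("-d must be an absolute path")
    if fmt == "csv":
        if mapping_file is None:
            raise ValueError("csv export requires a mapping file (-c)")
        if not os.path.isabs(mapping_file):
            raise ValueError("-c must be an absolute path")
    if fmt == "json" and graph_id is None:
        raise ValueError("json export requires a resource model UUID (-g)")
    return True


# A json export only needs a destination and a resource model UUID.
json_ok = validate_export_options(
    "json", "/data/export", graph_id="aa7ecf38-ab81-4e08-bb74-cfdd1e339ea2"
)

# A csv export without a mapping file is rejected.
try:
    validate_export_options("csv", "/data/export")
    csv_error = None
except ValueError as exc:
    csv_error = str(exc)
```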
Note that in a Windows command prompt, you may need to replace ' with ".
Exports all business data of the resource model indicated in the
mapping file. Two files are created. The first file contains one row
per resource (if your resources all have the same geometry type, this
file can be used to create a shapefile in QGIS or another program). The
second file contains the grouped attributes of your resources (for
instance, alternate names, additional classifications, etc.).
Exports all business data of the passed in resource_model_id to the
specified file format. Take a look at the RESOURCE_FORMATERS
dictionary in Arches’ settings.py for some other interesting
options.
Imports a mapping file for a particular resource model. This will be
used as the export mapping file for a resource by default (e.g. for
search export).
Managing Functions, DataTypes, Widgets, and Card Components
To learn how to build new Functions, DataTypes, Card Components, or Widgets,
please see Functions, Widgets, Card Components, or
Datatypes.
Note that when importing Widgets and associated DataTypes, Widgets
must be registered first.
All widget-related commands are identical to those for datatypes, just
substitute widget for datatype. Also note that where datatypes
are defined in .py files, widgets are defined in .json files.
All component-related commands are identical to those for widgets,
just substitute card_component for widget. JSON files are used
to register Card Components.
Run the Django dev server. Add 0.0.0.0:8000 to explicitly set the
host and port, which may be necessary when using remote servers, like
an AWS EC2 instance. More about runserver.
Collects all static files and places them in a single
directory. Generally only necessary in production. Also allows all
static files to be hosted on another server.
Django’s full manage.py commands are documented here.
Resources in an Arches database are separated into distinct Resource
Models designed to represent a kind of physical real-world resource,
such as a historic artifact or event. In the technical sense, the term
Resource Model refers collectively to the following user-facing
elements in Arches:
A Graph data structure representing a physical real-world resource,
such as a building, a public figure, a website, an archaeological
site, or a historic document.
A set of Cards to collect and display data associated with
instances of this Resource Model.
The relationships among these components and their dependencies are
visualized below:
The Arches logical model has been developed to support this modular
construction, and the relevant models are described below as they
pertain to the graph, UI components, and the resource data
itself (not illustrated above).
Note
In the UI you will see a distinction between “Resource Models” and
“Branches”, but underneath these are both made from instances of the Graph model. The primary difference between the two
is the isresource property, which is set to True for a
Resource Model.
Branches are used for records that might appear in multiple
Resource Models, such as a person or place. Branches can be
included as children of any Ontology-permitted Node in a Resource
Model.
Arches platform code defines base classes for some of its core data
models, and uses proxy models
to implement their controllers. In smaller classes, “controller” code
is included with the data model class. This documentation primarily
discusses the models, but controller behavior is discussed where
relevant to how the models are used, and all models are referred to by
their more succinct “controller” name.
ResourceInstance breaks the implicit naming convention above
because the term “Resource Model” refers to a specific Arches
construct, as explained in the Resource Model Overview above.
A Graph is a collection of NodeGroups, Nodes, and Edges which connect the Nodes.
Note
This definition does not include UI models and attributes, which
are discussed below.
In the Arches data model, Nodes represent their graph data structure
namesakes, sometimes called vertices. A Node does the work of
defining the Graph data structure in conjunction with one or more
Edges, and sometimes collecting data.
NodeGroups are an Arches feature used to represent a group of one or
more Nodes that collect data. NodeGroups can be nested, creating a
metadata structure which is used to display the graph in the UI and
collect related information together.
A NodeGroup exists for every Node that collects data, and both
contains and shares its UUID with that node (see
naming conventions for references). NodeGroups with more than
one member Node are used to collect composite or semantically-related
information. For example, a NodeGroup for a Node named Name.E1 may
contain a NameType.E55 Node. This way, a Graph with this
NodeGroup may store Names with multiple “types”, always collecting the
information together.
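The relationship between a NodeGroup and the Node that shares its UUID can be sketched with plain dataclasses. This is a simplified stand-in, not the Django models: the classes below are hypothetical, and `is_collector` only loosely mirrors the property of the same name on the Node model.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeGroup:
    nodegroupid: uuid.UUID
    parentnodegroup: Optional["NodeGroup"] = None  # records nesting, if any


@dataclass
class Node:
    nodeid: uuid.UUID
    name: str
    nodegroup: Optional[NodeGroup] = None

    @property
    def is_collector(self):
        # A node "collects" when it shares its UUID with its own NodeGroup.
        return self.nodegroup is not None and self.nodeid == self.nodegroup.nodegroupid


shared_id = uuid.uuid4()
name_group = NodeGroup(nodegroupid=shared_id)

# Name.E1 shares its UUID with the NodeGroup; NameType.E55 is a second member.
name_node = Node(nodeid=shared_id, name="Name.E1", nodegroup=name_group)
name_type_node = Node(nodeid=uuid.uuid4(), name="NameType.E55", nodegroup=name_group)
```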
NodeGroups are used to create Cards, and this is done
based on the cardinality property. Therefore, not every NodeGroup
will be used to create a Card, which allows NodeGroups to exist within
other NodeGroups. The parentnodegroup property is used to record
this nesting.
A user-defined Function may be registered and then associated with a
Graph in order to extend the behavior of Arches. For more information,
see here.
class GraphModel(models.Model):
    graphid = models.UUIDField(primary_key=True, default=uuid.uuid1)
    name = models.TextField(blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    deploymentfile = models.TextField(blank=True, null=True)
    author = models.TextField(blank=True, null=True)
    deploymentdate = models.DateTimeField(blank=True, null=True)
    version = models.TextField(blank=True, null=True)
    isresource = models.BooleanField()
    isactive = models.BooleanField()
    iconclass = models.TextField(blank=True, null=True)
    color = models.TextField(blank=True, null=True)
    subtitle = models.TextField(blank=True, null=True)
    ontology = models.ForeignKey('Ontology', db_column='ontologyid', related_name='graphs', null=True, blank=True)
    functions = models.ManyToManyField(to='Function', through='FunctionXGraph')
    jsonldcontext = models.TextField(blank=True, null=True)
    template = models.ForeignKey('ReportTemplate', db_column='templateid', default='50000000-0000-0000-0000-000000000001')
    config = JSONField(db_column='config', default={})

    @property
    def disable_instance_creation(self):
        if not self.isresource:
            return _('Only resource models may be edited - branches are not editable')
        if not self.isactive:
            return _('Set resource model status to Active in Graph Designer')
        return False

    def is_editable(self):
        result = True
        if self.isresource:
            resource_instances = ResourceInstance.objects.filter(graph_id=self.graphid).count()
            result = False if resource_instances > 0 else True
            if settings.OVERRIDE_RESOURCE_MODEL_LOCK == True:
                result = True
        return result

    class Meta:
        managed = True
        db_table = 'graphs'
class Node(models.Model):
    """
    Name is unique across all resources because it ties a node to values within tiles. Recommend prepending resource class to node name.

    """

    nodeid = models.UUIDField(primary_key=True, default=uuid.uuid1)
    name = models.TextField()
    description = models.TextField(blank=True, null=True)
    istopnode = models.BooleanField()
    ontologyclass = models.TextField(blank=True, null=True)
    datatype = models.TextField()
    nodegroup = models.ForeignKey(NodeGroup, db_column='nodegroupid', blank=True, null=True)
    graph = models.ForeignKey(GraphModel, db_column='graphid', blank=True, null=True)
    config = JSONField(blank=True, null=True, db_column='config')
    issearchable = models.BooleanField(default=True)
    isrequired = models.BooleanField(default=False)
    sortorder = models.IntegerField(blank=True, null=True, default=0)

    def get_child_nodes_and_edges(self):
        """
        gather up the child nodes and edges of this node

        returns a tuple of nodes and edges

        """
        nodes = []
        edges = []
        for edge in Edge.objects.filter(domainnode=self):
            nodes.append(edge.rangenode)
            edges.append(edge)

            child_nodes, child_edges = edge.rangenode.get_child_nodes_and_edges()
            nodes.extend(child_nodes)
            edges.extend(child_edges)
        return (nodes, edges)

    @property
    def is_collector(self):
        return str(self.nodeid) == str(self.nodegroup_id) and self.nodegroup is not None

    def get_relatable_resources(self):
        relatable_resource_ids = [r2r.resourceclassfrom for r2r in Resource2ResourceConstraint.objects.filter(resourceclassto_id=self.nodeid)]
        relatable_resource_ids = relatable_resource_ids + \
            [r2r.resourceclassto for r2r in Resource2ResourceConstraint.objects.filter(resourceclassfrom_id=self.nodeid)]
        return relatable_resource_ids

    def set_relatable_resources(self, new_ids):
        old_ids = [res.nodeid for res in self.get_relatable_resources()]
        for old_id in old_ids:
            if old_id not in new_ids:
                Resource2ResourceConstraint.objects.filter(Q(resourceclassto_id=self.nodeid) | Q(resourceclassfrom_id=self.nodeid), Q(resourceclassto_id=old_id) | Q(resourceclassfrom_id=old_id)).delete()
        for new_id in new_ids:
            if new_id not in old_ids:
                new_r2r = Resource2ResourceConstraint.objects.create(resourceclassfrom_id=self.nodeid, resourceclassto_id=new_id)
                new_r2r.save()

    class Meta:
        managed = True
        db_table = 'nodes'
class Edge(models.Model):
    edgeid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    name = models.TextField(blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    ontologyproperty = models.TextField(blank=True, null=True)
    domainnode = models.ForeignKey('Node', db_column='domainnodeid', related_name='edge_domains')
    rangenode = models.ForeignKey('Node', db_column='rangenodeid', related_name='edge_ranges')
    graph = models.ForeignKey('GraphModel', db_column='graphid', blank=True, null=True)

    class Meta:
        managed = True
        db_table = 'edges'
        unique_together = (('rangenode', 'domainnode'),)
class Function(models.Model):
    functionid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    name = models.TextField(blank=True, null=True)
    functiontype = models.TextField(blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    defaultconfig = JSONField(blank=True, null=True)
    modulename = models.TextField(blank=True, null=True)
    classname = models.TextField(blank=True, null=True)
    component = models.TextField(blank=True, null=True, unique=True)

    class Meta:
        managed = True
        db_table = 'functions'

    @property
    def defaultconfig_json(self):
        json_string = json.dumps(self.defaultconfig)
        return json_string

    def get_class_module(self):
        mod_path = self.modulename.replace('.py', '')
        module = None
        import_success = False
        import_error = None
        for function_dir in settings.FUNCTION_LOCATIONS:
            try:
                module = importlib.import_module(function_dir + '.%s' % mod_path)
                import_success = True
            except ImportError as e:
                import_error = e
            if module != None:
                break
        if import_success == False:
            print('Failed to import ' + mod_path)
            print(import_error)
        func = getattr(module, self.classname)
        return func
An ontology standardizes a set of valid CRM (Conceptual Reference
Model) classes for Node instances, as well as a set of relationships
that will define Edge instances. Most importantly, an ontology
enforces which Edges can be used to connect which Nodes. If a
pre-loaded ontology is designated for a Graph instance, every
NodeGroup within that Graph must conform to that ontology. You
may also create an “ontology-less” graph, which will not define
specific CRM classes for the Nodes and Edges.
These rules are stored as OntologyClass instances, which are stored
as JSON. These JSON objects consist of dictionaries with two
properties, down and up, each of which contains another two
properties ontology_property and ontology_classes (down assumes
a known domain class, while up assumes a known range class).
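To make the down/up structure more concrete, the sketch below looks up which range classes a given property permits from a known domain class. The `target` dictionary is abbreviated from the example in the OntologyClass docstring; `allowed_classes` and `edge_is_permitted` are hypothetical helpers, not Arches functions.

```python
# Abbreviated rule set in the shape described above.
target = {
    "down": [
        {
            "ontology_property": "P1_is_identified_by",
            "ontology_classes": ["E42_Identifier", "E41_Appellation"],
        }
    ],
    "up": [
        {
            "ontology_property": "P1i_identifies",
            "ontology_classes": ["E42_Identifier"],
        }
    ],
}


def allowed_classes(target, direction, ontology_property):
    """Return the classes permitted for a property in the given direction."""
    for rule in target.get(direction, []):
        if rule["ontology_property"] == ontology_property:
            return set(rule["ontology_classes"])
    return set()


def edge_is_permitted(target, ontology_property, range_class):
    """'down' assumes a known domain class, so it answers: which range classes may follow?"""
    return range_class in allowed_classes(target, "down", ontology_property)


ok = edge_is_permitted(target, "P1_is_identified_by", "E41_Appellation")
not_ok = edge_is_permitted(target, "P1_is_identified_by", "E21_Person")
```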
Arches comes preloaded with the CIDOC CRM, an ontology created by ICOM
(International Council of Museums) to model cultural heritage
documentation. However, a developer may create and load an entirely
new ontology.
class OntologyClass(models.Model):
    """
    the target JSONField has this schema:

    values are dictionaries with 2 properties, 'down' and 'up' and within each of those another 2 properties,
    'ontology_property' and 'ontology_classes'

    "down" assumes a known domain class, while "up" assumes a known range class

    .. code-block:: python

        "down":[
            {
                "ontology_property": "P1_is_identified_by",
                "ontology_classes": [
                    "E51_Contact_Point",
                    "E75_Conceptual_Object_Appellation",
                    "E42_Identifier",
                    "E45_Address",
                    "E41_Appellation",
                    ....
                ]
            }
        ]
        "up":[
            {
                "ontology_property": "P1i_identifies",
                "ontology_classes": [
                    "E51_Contact_Point",
                    "E75_Conceptual_Object_Appellation",
                    "E42_Identifier"
                    ....
                ]
            }
        ]

    """

    ontologyclassid = models.UUIDField(default=uuid.uuid1, primary_key=True)
    source = models.TextField()
    target = JSONField(null=True)
    ontology = models.ForeignKey('Ontology', db_column='ontologyid', related_name='ontologyclasses')

    class Meta:
        managed = True
        db_table = 'ontologyclasses'
        unique_together = (('source', 'ontology'),)
The RDM (Reference Data Manager) stores all of the vocabularies used
in your Arches installation. Whether they are simple wordlists or
polyhierarchical thesauri, these vocabularies are stored as “concept
schemes”, each of which can be viewed as an aggregation of one or more
concepts and the semantic relationships (links) between those
concepts.
In the data model, a concept scheme consists of a set of Concept
instances, each paired with a Value. In our running name/name_type
example, the NameType.E55 Node would be linked to a Concept
(NameType.E55) which would have two child Concepts. Thus, where
the user sees a dropdown containing “Primary” and “Alternate”, these
are actually the Values of NameType.E55’s two descendant
Concepts. The parent/child relationships between Concepts are stored
as Relation instances.
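The running NameType example can be sketched with three small dataclasses standing in for Concept, Value, and Relation. These are simplified, hypothetical stand-ins that keep only the fields needed to show the parent/child lookup; the real models carry additional fields such as value types and relation types.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Concept:
    conceptid: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class Value:
    concept: Concept
    value: str


@dataclass
class Relation:
    conceptfrom: Concept  # parent
    conceptto: Concept    # child


name_type = Concept()
primary, alternate = Concept(), Concept()

values = [
    Value(name_type, "NameType.E55"),
    Value(primary, "Primary"),
    Value(alternate, "Alternate"),
]
relations = [Relation(name_type, primary), Relation(name_type, alternate)]


def child_labels(parent):
    """Labels a dropdown would show for the children of a parent concept."""
    children = [r.conceptto for r in relations if r.conceptfrom is parent]
    return [v.value for v in values if any(v.concept is c for c in children)]


dropdown = child_labels(name_type)
```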
class Concept(models.Model):
    conceptid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    nodetype = models.ForeignKey('DNodeType', db_column='nodetype')
    legacyoid = models.TextField(unique=True)

    class Meta:
        managed = True
        db_table = 'concepts'
class Relation(models.Model):
    conceptfrom = models.ForeignKey(Concept, db_column='conceptidfrom', related_name='relation_concepts_from')
    conceptto = models.ForeignKey(Concept, db_column='conceptidto', related_name='relation_concepts_to')
    relationtype = models.ForeignKey(DRelationType, db_column='relationtype')
    relationid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.

    class Meta:
        managed = True
        db_table = 'relations'
        unique_together = (('conceptfrom', 'conceptto', 'relationtype'),)
class Value(models.Model):
    valueid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    concept = models.ForeignKey('Concept', db_column='conceptid')
    valuetype = models.ForeignKey(DValueType, db_column='valuetype')
    value = models.TextField()
    language = models.ForeignKey(DLanguage, db_column='languageid', blank=True, null=True)

    class Meta:
        managed = True
        db_table = 'values'
Three models are used to store Arches business data:
ResourceInstance - one per resource in the database
Tile - stores all business data
ResourceXResource - records relationships between resource instances
Creating a new resource in the database instantiates a new
ResourceInstance, which belongs to one resource model and has a
unique resourceinstanceid. A resource instance may also have its
own security/permissions properties in order to allow a fine-grained
level of user-based permissions.
Once data have been captured, they are stored as Tiles in the
database. Each Tile stores one instance of all of the attributes of a
given NodeGroup for a resource instance, as referenced by the
resourceinstanceid. This business data is stored as a JSON object:
a dictionary with any number of key/value pairs, in which each key is
a Node’s nodeid and each value is that Node’s value.
(In keeping with our running example, the keys in the second example
would refer to a Name.E1 node and a NameType.E55 node,
respectively.)
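As a small illustration of the tile data structure, the sketch below takes the example dictionary from the TileModel docstring and swaps each nodeid for a readable label. The `node_names` lookup and its labels are assumptions made for this example; in Arches the names would come from the Node records themselves.

```python
# Tile data as stored: keys are nodeids, values are that node's value.
tile_data = {
    "20000000-0000-0000-0000-000000000002": "John",
    "20000000-0000-0000-0000-000000000003": "Smith",
    "20000000-0000-0000-0000-000000000004": "Primary",
}

# Hypothetical nodeid -> node name lookup (labels invented for illustration).
node_names = {
    "20000000-0000-0000-0000-000000000002": "first_name",
    "20000000-0000-0000-0000-000000000003": "last_name",
    "20000000-0000-0000-0000-000000000004": "name_type",
}


def readable(tile_data, node_names):
    """Replace nodeid keys with node names, leaving unknown ids untouched."""
    return {node_names.get(nodeid, nodeid): value for nodeid, value in tile_data.items()}


labelled = readable(tile_data, node_names)
```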
Arches also allows for the creation of relationships between resource
instances, and these are stored as instances of the
ResourceXResource model. The resourceinstanceidfrom and
resourceinstanceidto fields create the relationship, and
relationshiptype qualifies the relationship. The latter must
correspond to the appropriate top node in the RDM, which constrains
the types of relationships available between resource
instances.
class ResourceInstance(models.Model):
    resourceinstanceid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    graph = models.ForeignKey(GraphModel, db_column='graphid')
    legacyid = models.TextField(blank=True, unique=True, null=True)
    createdtime = models.DateTimeField(auto_now_add=True)

    class Meta:
        managed = True
        db_table = 'resource_instances'
class TileModel(models.Model):  # Tile
    """
    the data JSONField has this schema:

    values are dictionaries with n number of keys that represent nodeid's and values the value of that node instance

    .. code-block:: python

        {
            nodeid: node value,
            nodeid: node value,
            ...
        }

        {
            "20000000-0000-0000-0000-000000000002": "John",
            "20000000-0000-0000-0000-000000000003": "Smith",
            "20000000-0000-0000-0000-000000000004": "Primary"
        }

    the provisionaledits JSONField has this schema:

    values are dictionaries with n number of keys that represent nodeid's and values the value of that node instance

    .. code-block:: python

        {
            userid: {
                value: node value,
                status: "review", "approved", or "rejected"
                action: "create", "update", or "delete"
                reviewer: reviewer's user id,
                timestamp: time of last provisional change,
                reviewtimestamp: time of review
                }
            ...
        }

        {
            1: {
                "value": {
                        "20000000-0000-0000-0000-000000000002": "Jack",
                        "20000000-0000-0000-0000-000000000003": "Smith",
                        "20000000-0000-0000-0000-000000000004": "Primary"
                      },
                "status": "rejected",
                "action": "update",
                "reviewer": 8,
                "timestamp": "20180101T1500",
                "reviewtimestamp": "20180102T0800",
                },
            15: {
                "value": {
                        "20000000-0000-0000-0000-000000000002": "John",
                        "20000000-0000-0000-0000-000000000003": "Smith",
                        "20000000-0000-0000-0000-000000000004": "Secondary"
                      },
                "status": "review",
                "action": "update",
                }
        }

    """

    tileid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    resourceinstance = models.ForeignKey(ResourceInstance, db_column='resourceinstanceid')
    parenttile = models.ForeignKey('self', db_column='parenttileid', blank=True, null=True)
    data = JSONField(blank=True, null=True, db_column='tiledata')  # This field type is a guess.
    nodegroup = models.ForeignKey(NodeGroup, db_column='nodegroupid')
    sortorder = models.IntegerField(blank=True, null=True, default=0)
    provisionaledits = JSONField(blank=True, null=True, db_column='provisionaledits')  # This field type is a guess.

    class Meta:
        managed = True
        db_table = 'tiles'

    def save(self, *args, **kwargs):
        if(self.sortorder is None or (self.provisionaledits is not None and self.data == {})):
            sortorder_max = TileModel.objects.filter(
                nodegroup_id=self.nodegroup_id, resourceinstance_id=self.resourceinstance_id).aggregate(Max('sortorder'))['sortorder__max']
            self.sortorder = sortorder_max + 1 if sortorder_max is not None else 0
        super(TileModel, self).save(*args, **kwargs)  # Call the "real" save() method.
class ResourceXResource(models.Model):
    resourcexid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    resourceinstanceidfrom = models.ForeignKey('ResourceInstance', db_column='resourceinstanceidfrom', blank=True, null=True, related_name='resxres_resource_instance_ids_from')
    resourceinstanceidto = models.ForeignKey('ResourceInstance', db_column='resourceinstanceidto', blank=True, null=True, related_name='resxres_resource_instance_ids_to')
    notes = models.TextField(blank=True, null=True)
    relationshiptype = models.TextField(blank=True, null=True)
    datestarted = models.DateField(blank=True, null=True)
    dateended = models.DateField(blank=True, null=True)
    created = models.DateTimeField()
    modified = models.DateTimeField()

    def delete(self):
        from arches.app.search.search_engine_factory import SearchEngineFactory
        se = SearchEngineFactory().create()
        se.delete(index='resource_relations', doc_type='all', id=self.resourcexid)
        super(ResourceXResource, self).delete()

    def save(self):
        from arches.app.search.search_engine_factory import SearchEngineFactory
        se = SearchEngineFactory().create()
        if not self.created:
            self.created = datetime.datetime.now()
        self.modified = datetime.datetime.now()
        document = model_to_dict(self)
        se.index_data(index='resource_relations', doc_type='all', body=document, idfield='resourcexid')
        super(ResourceXResource, self).save()

    class Meta:
        managed = True
        db_table = 'resource_x_resource'
A number of models exist specifically to support the resource model
UI. The purpose of this is to create direct relationships between the
resource graph and the data entry cards that are used to create
resource instances. Generally, the process works like this:
A resource graph is an organized collection of NodeGroups which
define what information will be gathered for a given resource
model.
A resource’s Cards are tied to specific
NodeGroups and define which input Widgets will be used
to gather values for each Node in that NodeGroup. Card Components are used to render the cards in various contexts
in the Arches UI.
Cards are UI representations of a NodeGroup, and they encapsulate the
Widgets that facilitate data entry for each Node in a given NodeGroup
instance.
While a Card will only handle data entry for a single NodeGroup (which
may have many Nodes or NodeGroups), a single NodeGroup can be handled
by more than one Card.
Throughout the Arches UI, Card Components are used to render Cards in
both read-only and data entry contexts.
Note
Beginning in Arches 4.3, Card Components provide functionality
formerly provided by Forms, Menus, and Reports.
class CardModel(models.Model):
    cardid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    name = models.TextField(blank=True, null=True)
    description = models.TextField(blank=True, null=True)
    instructions = models.TextField(blank=True, null=True)
    cssclass = models.TextField(blank=True, null=True)
    helpenabled = models.BooleanField(default=False)
    helptitle = models.TextField(blank=True, null=True)
    helptext = models.TextField(blank=True, null=True)
    nodegroup = models.ForeignKey('NodeGroup', db_column='nodegroupid')
    graph = models.ForeignKey('GraphModel', db_column='graphid')
    active = models.BooleanField(default=True)
    visible = models.BooleanField(default=True)
    sortorder = models.IntegerField(blank=True, null=True, default=None)
    component = models.ForeignKey('CardComponent', db_column='componentid', default=uuid.UUID('f05e4d3a-53c1-11e8-b0ea-784f435179ea'), on_delete=models.SET_DEFAULT)
    config = JSONField(blank=True, null=True, db_column='config')

    def is_editable(self):
        result = True
        tiles = TileModel.objects.filter(nodegroup=self.nodegroup).count()
        result = False if tiles > 0 else True
        if settings.OVERRIDE_RESOURCE_MODEL_LOCK == True:
            result = True
        return result

    class Meta:
        managed = True
        db_table = 'cards'
class Widget(models.Model):
    widgetid = models.UUIDField(primary_key=True, default=uuid.uuid1)  # This field type is a guess.
    name = models.TextField(unique=True)
    component = models.TextField(unique=True)
    defaultconfig = JSONField(blank=True, null=True, db_column='defaultconfig')
    helptext = models.TextField(blank=True, null=True)
    datatype = models.TextField()

    @property
    def defaultconfig_json(self):
        json_string = json.dumps(self.defaultconfig)
        return json_string

    class Meta:
        managed = True
        db_table = 'widgets'
Throughout the code, you will sometimes see an entity name with “id”
appended and other times see the same name with “_id” appended. For
example, you’ll see both nodegroupid and nodegroup_id.
What is the difference?
The first, nodegroupid, is a UUID attribute in the database and is
the primary key for entities of type NodeGroup.
The second, nodegroup_id, is a foreign key attribute (thus also a
UUID) that refers from somewhere else to a NodeGroup. For example, a
Tile object may have an associated NodeGroup; that NodeGroup object
itself would be referenced as tile.nodegroup, and the NodeGroup’s
UUID – which in the context of a Tile object is a foreign key –
would therefore be tile.nodegroup_id.
The reason to use tile.nodegroup_id, instead of getting the
NodeGroup’s ID by going through the associated NodeGroup object with
tile.nodegroup.nodegroupid, is that the latter would involve an
extra database query to fetch the NodeGroup instance, which would be a
waste if you don’t actually need the NodeGroup itself. When all you
need is the NodeGroup’s UUID – perhaps because you’re just going to
pass it along to something else that only needs the UUID – then
there’s no point fetching the entire NodeGroup when you already have
the Tile in hand and the Tile’s nodegroup_id field is a foreign
key to the Tile’s associated NodeGroup. You might as well just get
that foreign key, tile.nodegroup_id, directly.
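The cost difference can be illustrated without a database. The sketch below is a plain-Python simulation, not Django code: `FakeDatabase`, `Tile`, and `NodeGroup` are hypothetical stand-ins in which following the relation triggers a counted lookup, while reading the stored foreign key does not.

```python
class FakeDatabase:
    """Stand-in for the database that counts how many lookups are made."""

    def __init__(self):
        self.queries = 0
        self.nodegroups = {}

    def fetch_nodegroup(self, nodegroupid):
        self.queries += 1
        return self.nodegroups[nodegroupid]


class NodeGroup:
    def __init__(self, nodegroupid):
        self.nodegroupid = nodegroupid  # primary key


class Tile:
    def __init__(self, db, nodegroup_id):
        self._db = db
        self.nodegroup_id = nodegroup_id  # foreign key already held by the tile

    @property
    def nodegroup(self):
        # Following the relation requires fetching the related object.
        return self._db.fetch_nodegroup(self.nodegroup_id)


db = FakeDatabase()
db.nodegroups["ng-1"] = NodeGroup("ng-1")
tile = Tile(db, "ng-1")

direct = tile.nodegroup_id                # no lookup needed
queries_after_direct = db.queries

via_object = tile.nodegroup.nodegroupid   # one lookup to reach the same value
queries_after_traversal = db.queries
```

Both expressions yield the same identifier; only the second one touches the simulated database.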
Arches provides methods for importing data in a few different formats. Generally, you are placing the values you want to import into a structured file. The form that each value takes depends on the data type of its target node.
Be aware that the graph-based structure of Resource Models in Arches means that your data must be carefully prepared before import, to ensure that branches, groupings, and cardinality are maintained. The method for doing this is determined by which file format you decide to use. Additionally, the data type of the target node for each value in your file will dictate that value’s format.
Nodes in your target resource model will have a specific datatype defined for each one (see Core Arches Datatypes), and it is very important that you format your input data accordingly. Below is a list of all core datatypes and how they should look in your import files.
Arches supports level 2 of the EDTF specification. However, because of a bug in the edtf package used by Arches,
an error will be thrown for strings like:
In CSV/SHP, if the values in your concept collection are unique you can use the label (prefLabel) for a concept. If not, you will get an error during import and you must use UUIDs instead of labels (if this happens, see Concepts File below):
Slate
2995daea-d6d3-11e8-9eb1-0242ac150004
If a prefLabel has a comma in it, it must be triple-quoted:
In CSV/SHP, simply use the file name, or a single-quoted list of file names:
BuildingPicture.jpg
See the note below about where to prepopulate this file on your server, if you are not uploading it through the package load operation.
In JSON, you must include a more robust definition of the file that looks like this (and remember, this must be a list, even if you only have one file per node):
You should be able to generate this content by doing the following:
Pregenerate a new UUID for each file
Place this UUID in the file_id property, and also use it in the url property as shown above.
Select a renderer from settings.RENDERERS (see settings.py) and use its id for the renderer property. At the time of this writing, use 5e05aa2e-5db0-4922-8938-b4d2b7919733 for images (jpg, png, etc.) and 09dec059-1ee8-4fbd-85dd-c0ab0428aa94 for PDFs.
Set the type as appropriate–image/jpeg, image/png, application/pdf, etc.
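The steps above can be sketched as a small helper that builds one entry of the file list. This is only a partial sketch: the `name` key is an assumption, and the `url` property is deliberately left out because its exact form is defined by the JSON example rather than by these steps, so it must be filled in separately using the same `file_id`. The renderer ids are the ones quoted above.

```python
import mimetypes
import uuid

# Renderer ids quoted in the steps above.
IMAGE_RENDERER = "5e05aa2e-5db0-4922-8938-b4d2b7919733"
PDF_RENDERER = "09dec059-1ee8-4fbd-85dd-c0ab0428aa94"


def describe_file(filename):
    """Build a partial file entry (hypothetical helper; 'url' still needs to be added)."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type == "application/pdf":
        renderer = PDF_RENDERER
    elif mime_type is not None and mime_type.startswith("image/"):
        renderer = IMAGE_RENDERER
    else:
        raise ValueError("no renderer chosen for %r" % filename)
    return {
        "name": filename,              # assumed key name
        "file_id": str(uuid.uuid4()),  # pregenerated UUID for this file
        "renderer": renderer,
        "type": mime_type,
    }


# The value for a file-list node must be a list, even for a single file.
file_list = [describe_file("BuildingPicture.jpg")]
```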
Note
The file(s) should already exist in the uploadedfiles/ directory prior to loading the resource, but technically can be added later as well. This directory should be located within your MEDIA_ROOT location. For example, by default, Arches sets MEDIA_ROOT=os.path.join(ROOT_DIR). This means you should find (or create if it doesn’t exist) my_project/uploadedfiles, alongside manage.py.
resourceId (required) - the target resource-instance ResourceID
ontologyProperty (can be left blank) - the URL of the ontology property that defines the relationship to the target resource-instance
resourceXresourceId (can be left blank) - the system will assign a UUID for this relationship
inverseOntologyProperty (can be left blank) - the URL of the ontology property that defines the inverse of the relationship referenced under ontologyProperty
In CSV/SHP, same as above, except repeating each resource-instance within the square brackets (i.e. “[{first resource-instance},{second resource-instance}]” ):
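One way to produce such a cell value is to build the list in code and serialize it. The sketch below uses the four property names listed above and leaves the optional ones as empty strings; `resource_instance_cell` is a hypothetical helper, and whether an empty string is the preferred way to leave a property blank is an assumption.

```python
import json


def resource_instance_cell(resource_ids):
    """Serialize a list of related resources as a single JSON string (hypothetical helper)."""
    entries = [
        {
            "resourceId": resource_id,      # required
            "ontologyProperty": "",         # may be left blank
            "resourceXresourceId": "",      # may be left blank; the system assigns a UUID
            "inverseOntologyProperty": "",  # may be left blank
        }
        for resource_id in resource_ids
    ]
    return json.dumps(entries)


cell = resource_instance_cell([
    "aa7ecf38-ab81-4e08-bb74-cfdd1e339ea2",
    "2cf195f8-805b-4f97-9133-cbd94bf5a01f",
])
```

Because the resulting string contains commas and double quotes, writing it with a CSV library rather than by hand keeps the quoting correct.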
One method of bulk loading data into Arches is to create a CSV (comma separated values) file. We recommend using MS Excel or Open Office for this task. More advanced users will likely find a custom scripting effort to be worthwhile.
Note
Your CSV should be encoded into UTF-8. These steps will help you if you are using MS Excel.
The workflow for creating a CSV should be something like this:
Identify which Resource Model you are loading data into
Download the mapping file and concepts file for that resource model
Each row in the CSV can contain the attribute values of one and only one resource.
The first column in the CSV must be named ResourceID. ResourceID is a user-generated unique ID for each individual resource. If ResourceID is a valid UUID, Arches will adopt it internally as the new resource’s identifier. If ResourceID is not a valid UUID Arches will create a new UUID and use that as the resource’s identifier. Subsequent columns can have any name.
ResourceIDs must be unique among all resources imported, not just within each CSV. For this reason we suggest using UUIDs.
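The ResourceID rule can be mimicked with Python's uuid module. This is a sketch of the described behavior, not the importer's own code: `resolve_resource_id` is a hypothetical helper, and `uuid.UUID` accepts a few spellings (for example, without hyphens) that Arches may or may not treat as valid.

```python
import uuid


def resolve_resource_id(resource_id):
    """Return (identifier, adopted): keep a valid UUID, otherwise mint a new one."""
    try:
        return uuid.UUID(str(resource_id)), True
    except ValueError:
        return uuid.uuid4(), False


kept, kept_adopted = resolve_resource_id("aa7ecf38-ab81-4e08-bb74-cfdd1e339ea2")
minted, minted_adopted = resolve_resource_id("1")
```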
| ResourceID | attribute 1 | attribute 2 | attribute 3 |
|---|---|---|---|
| 1 | attr. 1 value | attr. 2 value | attr. 3 value |
| 2 | attr. 1 value | attr. 2 value | attr. 3 value |
| 3 | attr. 1 value | attr. 2 value | attr. 3 value |

Simple CSV with three resources, each with three different attributes.
Or, in a raw format (if you open the file in a text editor), the CSV should look like this:
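As an illustration, the three-resource example can be written with Python's csv module, which also shows the raw text that ends up in the file. The variable names are illustrative.

```python
import csv
import io

rows = [
    ["ResourceID", "attribute 1", "attribute 2", "attribute 3"],
    ["1", "attr. 1 value", "attr. 2 value", "attr. 3 value"],
    ["2", "attr. 1 value", "attr. 2 value", "attr. 3 value"],
    ["3", "attr. 1 value", "attr. 2 value", "attr. 3 value"],
]

# Write to an in-memory buffer; use open(path, "w", newline="", encoding="utf-8")
# to write a real UTF-8 encoded file instead.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
raw_csv = buffer.getvalue()

print(raw_csv)
# ResourceID,attribute 1,attribute 2,attribute 3
# 1,attr. 1 value,attr. 2 value,attr. 3 value
# 2,attr. 1 value,attr. 2 value,attr. 3 value
# 3,attr. 1 value,attr. 2 value,attr. 3 value
```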
Multiple lines may be used to add multiple attributes to a single resource. You must make sure these lines are contiguous, and every line must have a ResourceID. Other cells are optional.
| ResourceID | attribute 1 | attribute 2 | attribute 3 |
|---|---|---|---|
| 1 | attr. 1 value | attr. 2 value | attr. 3 value |
| 2 | attr. 1 value | attr. 2 value | attr. 3 value |
| 2 | | attr. 2 additional value | |
| 3 | attr. 1 value | attr. 2 value | attr. 3 value |

CSV with three resources, one of which has two values for attribute 2.
Depending on your Resource Model’s graph structure, some attributes will be handled as “groups”. For example, Name and NameType attributes would be a group. Attributes that are grouped must be on the same row. However, a single row can have many different groups of attributes in it, but there may be only one of each group type per row. (e.g. you cannot have two names and two name types in one row).
| ResourceID | name | name_type | description |
|---|---|---|---|
| 1 | Yucca House | Primary | “this house, built in…” |
| 2 | Big House | Primary | originally a small cabin |
| 2 | Old Main Building | Historic | |
| 3 | Writer’s Cabin | Primary | housed resident authors |

CSV with three resources, one of which has two groups of name and name_type attributes. Note that “Primary” and “Historic” are the prefLabels for two different concepts in the RDM.
You must have values for any required nodes in your resource models.
Note
If you are using MS Excel to create your CSV files, double-quotes will automatically be added to any cell value that contains a comma.
All CSV files must be accompanied by a mapping file. This is a JSON-structured file that indicates which node in a Resource Model’s graph each column in the CSV file should map to. The mapping file should contain the source column name populated in the file_field_name property for all nodes in a graph the user wishes to map to. The mapping file should be named exactly the same as the CSV file but with the extension ‘.mapping’, and should be in the same directory as the CSV.
To create a mapping file for a Resource Model in your database, go to the Arches Designer landing page. Find the Resource Model into which you plan to load resources, and choose Export Mapping File from the Manage menu.
Unzip the download, and you’ll find a .mapping file as well as a _concepts.json file (see Concepts File). The contents of the mapping file will look something like this:
The mapping file contains cursory information about the resource model (name and resource model id) and a listing of the nodes that compose that resource model. Each node contains attributes to help you import your business data (not all attributes are used on import; some are there simply to assist you). The concept_export_value attribute is only present for nodes with datatypes of concept, concept-list, domain, and domain-list, and it is not used for import. It is recommended that you not delete any attributes from the mapping file. If you do not wish to map to a specific node, simply set the file_field_name attribute to "".
You will now need to enter the column name from your CSV into the file_field_name attribute of the appropriate node in the mapping file. For example, if your CSV has a column named “activity_type” and you want the values in this column to populate “Activity Type” nodes in Arches, you would add that name to the mapping file like so:
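As a rough illustration, the edit can also be scripted. The sketch below is not an Arches tool. It assumes the mapping file holds its node entries in a `nodes` list and that each entry carries an `arches_node_name` key; only `file_field_name` is confirmed by the text above, so check the other key names against your own downloaded mapping file. The trimmed mapping content and the file path are hypothetical.

```python
import json
from pathlib import Path


def map_column(mapping: dict, node_name: str, column_name: str) -> dict:
    """Set file_field_name on every node whose name matches node_name."""
    for node in mapping["nodes"]:  # "nodes" key is an assumption
        if node["arches_node_name"] == node_name:  # key name is an assumption
            node["file_field_name"] = column_name
    return mapping


# Hypothetical, heavily trimmed mapping content for illustration only.
mapping = {
    "resource_model_name": "Example Model",
    "nodes": [
        {"arches_node_name": "Activity Type", "file_field_name": ""},
        {"arches_node_name": "Name", "file_field_name": ""},
    ],
}
map_column(mapping, "Activity Type", "activity_type")

# The mapping file sits next to the CSV and shares its base name.
csv_path = Path("data/activities.csv")  # hypothetical path
mapping_path = csv_path.with_suffix(".mapping")
print(mapping_path)
print(json.dumps(mapping["nodes"][0]))
```

Nodes left with an empty file_field_name are simply not mapped, as described above.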
When populating concept nodes from a CSV you should generally use the prefLabel for that concept. However, in rare instances there may be two or more concepts in your collection that have identical prefLabels (this is allowed in Arches). In this case you will need to replace the prefLabel in your CSV with the UUID for the Value that represents that prefLabel.
To aid with the process, a “concepts file” is created every time you download a mapping file, which lists the valueids and corresponding labels for all of the concepts in all of the concept collections associated with any of the Resource Model’s nodes. For example:
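To find out whether any prefLabels in a collection are ambiguous, a short script can group valueids by label. This sketch assumes each collection in the concepts file can be read as a dictionary of valueid to label; that layout is an assumption, so adjust the code to match your own file. The ids below are placeholders rather than real UUIDs.

```python
from collections import defaultdict


def ambiguous_labels(collection: dict) -> dict:
    """Map each label used by more than one valueid to the list of those valueids."""
    by_label = defaultdict(list)
    for value_id, label in collection.items():
        by_label[label].append(value_id)
    return {label: ids for label, ids in by_label.items() if len(ids) > 1}


# Hypothetical collection: valueid -> label (placeholder ids, not real UUIDs).
name_types = {
    "value-id-1": "Primary",
    "value-id-2": "Historic",
    "value-id-3": "Primary",
}
print(ambiguous_labels(name_types))
```

Any label reported here is one for which the CSV should carry the valueid UUID instead of the prefLabel.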
Uploading a shapefile to Arches is very similar to uploading a CSV file with a few exceptions. The same rules apply to rich text, concept data, grouped data, and contiguousness. And, like CSV import, shapefile import requires a mapping file. Note that in this mapping file, the node you wish to map the geometry to must have a file_field_name value of ‘geom’.
Other Requirements:
The shapefile must contain a field with a unique identifier for each resource named ‘ResourceID’.
The shapefile must be in WGS 84 (EPSG:4326) decimal degrees.
The shapefile must consist of at least a .shp, .dbf, .shx, and .prj file. It may be zipped or unzipped.
Dates in a shapefile can be in ESRI Shapefile date format; Arches will convert them to the appropriate date format. They can also be strings stored in YYYY-MM-DD format.
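The file requirement above is easy to check before attempting an import. This is a minimal sketch with a hypothetical helper, `missing_components`, that takes the file names belonging to one shapefile (for example from a directory listing or from `zipfile.ZipFile(...).namelist()`) and reports which required parts are absent.

```python
from pathlib import PurePath

REQUIRED_SUFFIXES = (".shp", ".dbf", ".shx", ".prj")


def missing_components(file_names) -> list:
    """Return the required shapefile extensions that are absent from file_names."""
    present = {PurePath(name).suffix.lower() for name in file_names}
    return [suffix for suffix in REQUIRED_SUFFIXES if suffix not in present]


print(missing_components(["sites.shp", "sites.dbf"]))
```

The check only looks at extensions. It does not confirm the projection or the presence of a ResourceID field, which still need to be verified separately.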
Note
More complex geometries may encounter a mapping_parser_exception error. This error occurs when a geometry is not valid in Elasticsearch. To resolve this, first make sure your geometry is valid using ArcMap, QGIS, or PostGIS. Next, you can reduce the precision of your geometry to 5 decimal places, or you can simplify your geometry using the QGIS simplify geometry geoprocessing tool or the PostGIS st_snaptogrid function.
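If your geometries are available as GeoJSON, the precision reduction mentioned above can also be done in plain Python. This is only a sketch of the rounding step, with hypothetical helper names, and it handles geometry types that carry a `coordinates` member (not GeometryCollection). Rounding alone does not guarantee a valid geometry, so re-check validity afterwards.

```python
def round_coordinates(coords, ndigits=5):
    """Recursively round every number in a GeoJSON coordinates array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coordinates(item, ndigits) for item in coords]


def reduce_precision(geometry: dict, ndigits: int = 5) -> dict:
    """Return a copy of a GeoJSON geometry with rounded coordinates."""
    return {**geometry, "coordinates": round_coordinates(geometry["coordinates"], ndigits)}


point = {"type": "Point", "coordinates": [-118.4752891, 34.0771234]}
print(reduce_precision(point))
```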
JSON import of business data is primarily intended for transferring business data between Arches instances. Because of this, the JSON data format is not especially user-friendly to create or interpret, but doing so is not impossible.
First, there are at least two ways you can familiarize yourself with the format. The system settings in an Arches package are stored in this JSON format, so you can open one of those files and take a look. Perhaps a better way is to create some business data via the UI in your instance of Arches and export it to JSON using the business data export command described in Export Commands. The exported file can act as a template for data creation. For the rest of this section it may be helpful to have one of these files open to make it easier to follow along.
The JSON format is primarily a representation of the tiles table in the Arches Postgres database, with some information about the resource instance(s) included. Within the business_data object of the JSON are two objects, the tiles object and the resourceinstance object. Let’s start with the resourceinstance object.
graph_id - the id of the resource model for which this data was created
resourceinstanceid - the unique identifier of this resource instance within Arches (this will need to be unique for every resource in Arches)
legacyid - an identifier that was used for this resource before its inclusion in Arches. This can be the same as the resourceinstanceid (this is the case when you provide a UUID to the ResourceID column in a CSV) or it can be another id. Either way it has to be unique among every resource in Arches.
The tiles object is a list of tiles that compose a resource instance. The tiles object is a bit more complicated than the resourceinstance object, and the structure can vary depending on the cardinality of your nodes. The cardinality cases covered below are 1, n, 1-1, 1-n, n-1, and n-n. Each tile has the following properties:
tileid - unique identifier of the tile; this is the primary key in the tiles table and must be a unique UUID
resourceinstance_id - the UUID corresponding to the instance this tile belongs to (this should be the same as the resourceinstanceid from the resourceinstance object)
nodegroup_id - the node group for which the nodes within the data array participate
sortorder - the sort order of this data in the form/report relative to other tiles (only applicable if cardinality is n)
parenttile_id - unique identifier of the parenttile of this tile (will be null if this is a parent tile or the tile has no parent)
data - json structure of a node group including the nodeid and data populating that node. For example:
{"data":{"<uuid for building name node>":"Smith Cottage"}}
The tile object is tied to a resource model in two ways: 1) through the nodegroup_id 2) in the data object where nodeids are used as keys for the business data itself.
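To make the tile properties concrete, here is a small Python sketch that assembles one tile using the property names listed above. `make_tile` is a hypothetical helper, not part of Arches, and the identifiers are randomly generated placeholders. In practice the nodegroup and node ids come from your resource model.

```python
import json
import uuid


def make_tile(resourceinstance_id, nodegroup_id, data, parenttile_id=None, sortorder=0):
    """Build one tile dict using the property names listed above."""
    return {
        "tileid": str(uuid.uuid4()),
        "resourceinstance_id": resourceinstance_id,
        "nodegroup_id": nodegroup_id,
        "sortorder": sortorder,
        "parenttile_id": parenttile_id,
        "data": data,
    }


# Placeholder identifiers standing in for real resource model values.
resource_id = str(uuid.uuid4())
nodegroup_id = str(uuid.uuid4())
building_name_node_id = str(uuid.uuid4())

tile = make_tile(resource_id, nodegroup_id, {building_name_node_id: "Smith Cottage"})
print(json.dumps(tile, indent=2))
```

Note how the tile refers to the resource model twice: once through nodegroup_id and once through the node id used as a key inside data.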
Now for a detailed look at the actual contents of tiles. Note that below we are using simplified values for tileid, like "A" and "B", to clearly illustrate parent/child relationships. In reality these must be valid UUIDs.
1: There is one and only one instance of this nodegroup/card in a resource:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{"nodeid":"some data","nodeid":"some other data"}}]
This structure represents a tile for a nodegroup (consisting of two nodes) with no parents collecting data with a cardinality of 1.
n: There are multiple instances of this nodegroup/card in a resource:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{"nodeid":"some data","nodeid":"some other data"}},{"tileid":"B","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{"nodeid":"more data","nodeid":"more other data"}}]
1-1: One and only one parent nodegroup/card contains one and only one child nodegroup/card:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"X","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"A","data":{"nodeid":"data","nodeid":"other data"}}]
1-n: One and only one parent nodegroup/card containing multiple instances of child nodegroups/cards:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"X","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"A","data":{"nodeid":"data","nodeid":"other data"}},{"tileid":"Y","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"A","data":{"nodeid":"more data","nodeid":"more other data"}}]
n-1: Many parent nodegroups/cards each with one child nodegroup/card:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"X","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"A","data":{"nodeid":"data","nodeid":"other data"}},{"tileid":"B","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"Y","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"B","data":{"nodeid":"more data","nodeid":"more other data"}}]
n-n: Many parent nodegroups/cards containing many child nodegroups/cards:
[{"tileid":"A","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"X","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"A","data":{"nodeid":"data","nodeid":"other data"}},{"tileid":"B","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":null,"data":{}},{"tileid":"Y","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"B","data":{"nodeid":"more data","nodeid":"more other data"}},{"tileid":"Z","resourceinstance_id":"<uuid from resourceinstance.resourceinstanceid>","nodegroup_id":"<uuid from resource model>","sortorder":0,"parenttile_id":"B","data":{"nodeid":"even more data","nodeid":"even more other data"}}]
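When writing tiles by hand, a quick consistency check on the parent/child links can catch mistakes early. The sketch below uses two hypothetical helpers and the simplified ids from the n-n example above, reduced to the two fields that express the hierarchy.

```python
def orphaned_tiles(tiles: list) -> list:
    """Return tileids whose parenttile_id does not match any tileid in the list."""
    known = {tile["tileid"] for tile in tiles}
    return [
        tile["tileid"]
        for tile in tiles
        if tile["parenttile_id"] is not None and tile["parenttile_id"] not in known
    ]


def children_by_parent(tiles: list) -> dict:
    """Group child tileids under their parent tileid."""
    grouped = {}
    for tile in tiles:
        if tile["parenttile_id"] is not None:
            grouped.setdefault(tile["parenttile_id"], []).append(tile["tileid"])
    return grouped


tiles = [
    {"tileid": "A", "parenttile_id": None},
    {"tileid": "X", "parenttile_id": "A"},
    {"tileid": "B", "parenttile_id": None},
    {"tileid": "Y", "parenttile_id": "B"},
    {"tileid": "Z", "parenttile_id": "B"},
]
print(children_by_parent(tiles))
print(orphaned_tiles(tiles))
```

Here parent A has one child and parent B has two, which matches the n-n case, and no child points at a missing parent.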
It is possible to batch import Resource Relations (also referred to as “resource-to-resource relationships”). To do so, create a .relations file (a CSV-formatted file with a .relations extension). The header of the file should be as follows:
In each row, resourceinstanceidfrom and resourceinstanceidto must either be an Arches ID (the UUID assigned to a new resource when it is first created) or a Legacy ID (an identifier from a legacy database that was used as a ResourceID in a JSON or CSV import file).
You can find the UUID value for your desired relationshiptype in the _concepts.json file downloaded with your resource model mapping file.
datestarted, dateended and notes are optional fields. Dates should be formatted YYYY-MM-DD.
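A .relations file can be generated with Python's standard csv module. The sketch below is an illustration only: the six column names come from this section, but their order is an assumption, and the row values are placeholders to be replaced with real identifiers.

```python
import csv
import io

# Column order is an assumption based on the field names in this section.
RELATION_COLUMNS = [
    "resourceinstanceidfrom",
    "resourceinstanceidto",
    "relationshiptype",
    "datestarted",
    "dateended",
    "notes",
]


def relations_csv(rows: list) -> str:
    """Serialize relation dicts to CSV text; missing optional fields become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=RELATION_COLUMNS, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


text = relations_csv([
    {
        "resourceinstanceidfrom": "<uuid or legacy id>",
        "resourceinstanceidto": "<uuid or legacy id>",
        "relationshiptype": "<relationship type uuid>",
        "datestarted": "2020-01-31",
    }
])
print(text)
```

Write the returned text to a file with a .relations extension, encoded as UTF-8.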
Once constructed you can import the .relations file with the following command:
All the resources referenced in the .relations CSV need to already be in your database. So make sure to run this command after you have imported all the business data referenced in the .relations file.
Note
You can also create relationships between resources using the resource-instance data type. When you are making the graph for a new resource model, you can set one of the nodes to hold a resource instance. This is not the same as creating Resource Relations as described above.
Arches provides database functions that are meant to assist with the loading, updating and querying of Arches business data via SQL. This strategy is especially useful if you are migrating an existing SQL database into Arches.
SQL import is more flexible and faster than loading via CSV; however, it requires some SQL skills to write scripts that interact with the data.
The core functions that Arches provides allow for flexible, on-demand creation of database views that represent the Arches graph schema as relational entities. These database views can be queried using SQL, including INSERT, UPDATE, and DELETE operations.
The __arches_create_nodegroup_view function creates a view representing a specific nodegroup in the Arches graph schema. The resultant view can be queried using SQL, including INSERT, UPDATE, and DELETE operations. If no view name is provided, then the function will attempt to create a view with the name of the nodegroup’s root node processed to be suitable for a database entity name (for example, spaces replaced with underscores).
Arguments
group_id: uuid - the UUID of the nodegroup for which a view will be created.
view_name: text (optional) - the name to be used for the view being created, defaults to null
schema_name: text (optional) - the name of the schema to which the new view will be added, defaults to ‘public’
parent_name: text (optional) - name used for column containing the parent tile id, defaults to ‘parenttileid’
Returns
text - message indicating success and name of the view created.
Creates a series of views (using the above __arches_create_nodegroup_view function) representing a specific nodegroup and all of its child nodegroups (recursively) in the Arches graph schema.
Arguments
group_id: uuid - the UUID of the nodegroup for which views will be created recursively.
schema_name: text (optional) - the name of the schema to which the new views will be added, defaults to ‘public’
The __arches_create_resource_model_views function creates a schema (dropping it first if it already exists), a view representing the instances of a specific resource model, and a series of views (using the above __arches_create_nodegroup_view function) for each of its nodegroups in the Arches graph schema. If no schema name is provided, then the function will attempt to create a schema with the name of the resource model processed to be suitable for a database entity name (for example, spaces replaced with underscores).
Arguments
model_id: uuid - the UUID of the resource model for which views will be created.
schema_name: text (optional) - the name of the schema to which the new views will be added, defaults to null
Returns
text - message indicating success and the name of the schema created.
In addition to the functions that create views, helper functions are also available to assist in the creation of tile data using the created views.
Returns the node id for a given view column. This is useful for subsequently looking up additional information about a column/node in the Arches graph schema, for example, creating a lookup table of concepts for a particular column/node.
Arguments
schema_name: text - the name of the schema that contains the view of interest
For a hypothetical example, consider a table in your legacy database called buildings with name and resourceid columns. The following could be used to migrate the rows into new Arches resource instances.
Let’s assume we have a Resource Model called “Architectural Resource”, and it has two nodes, “Name” and “Name Type”, under a single semantic node “Names”.
Use the __arches_create_resource_model_views function (see above) to create a new schema for each active Resource Model.
SELECT __arches_create_resource_model_views(graphid)
    FROM graphs
    WHERE isactive = true
    AND name != 'Arches System Settings';
In our case, the result will be a new schema called architectural_resource and a view called names (named for the node furthest up the hierarchy in the nodegroup, in this case, a semantic node).
Directly inserting our records into the new Arches view will look something like this:
In this case, “Primary” is being given to every name type, because your legacy database did not have more than one name per resource.
Todo
A second table may need to be populated here too, to register the instances themselves.
Warning
This SQL method for inserting records has a known and severe performance issue for Postgres/PostGIS instances installed on Ubuntu, Debian, and Alpine operating systems. On these operating systems, a node-instance data insert of only a few thousand records may result in a database connection time out error (see issue discussion here: archesproject/arches#9049).
This issue is known to impact Arches versions 7.x. A fix for this OS related issue will likely come with Arches version 7.5. If you are using a version of Arches impacted by this issue, you can use the following workaround to vastly (perhaps 50x) improve the performance of the SQL method for inserts. Execute the following SQL BEFORE you run SQL inserts:
create or replace function __arches_tile_view_update() returns trigger as $$
    declare
        view_namespace text;
        group_id uuid;
        graph_id uuid;
        parent_id uuid;
        tile_id uuid;
        transaction_id uuid;
        json_data json;
        old_json_data jsonb;
        edit_type text;
    begin
        -- Read the nodegroup id from the view's comment first, then look up
        -- its graph, so that graph_id is not selected with a null group_id.
        view_namespace = format('%s.%s', tg_table_schema, tg_table_name);
        select obj_description(view_namespace::regclass, 'pg_class') into group_id;
        select graphid into graph_id from nodes where nodeid = group_id;
        if (TG_OP = 'DELETE') then
            -- Remove the tile and its relations, queue a reindex, log the delete.
            select tiledata into old_json_data from tiles where tileid = old.tileid;
            delete from resource_x_resource where tileid = old.tileid;
            delete from public.tiles where tileid = old.tileid;
            insert into bulk_index_queue (resourceinstanceid, createddate)
                values (old.resourceinstanceid, current_timestamp) on conflict do nothing;
            insert into edit_log (
                resourceclassid,
                resourceinstanceid,
                nodegroupid,
                tileinstanceid,
                edittype,
                oldvalue,
                timestamp,
                note,
                transactionid
            ) values (
                graph_id,
                old.resourceinstanceid,
                group_id,
                old.tileid,
                'tile delete',
                old_json_data,
                now(),
                'loaded via SQL backend',
                public.uuid_generate_v1mc()
            );
            return old;
        else
            select __arches_get_json_data_for_view(new, tg_table_schema, tg_table_name) into json_data;
            select __arches_get_parent_id_for_view(new, tg_table_schema, tg_table_name) into parent_id;
            tile_id = new.tileid;
            if (new.transactionid is null) then
                transaction_id = public.uuid_generate_v1mc();
            else
                transaction_id = new.transactionid;
            end if;

            if (TG_OP = 'UPDATE') then
                select tiledata into old_json_data from tiles where tileid = tile_id;
                edit_type = 'tile edit';
                if (transaction_id = old.transactionid) then
                    transaction_id = public.uuid_generate_v1mc();
                end if;
                update public.tiles
                set tiledata = json_data,
                    nodegroupid = group_id,
                    parenttileid = parent_id,
                    resourceinstanceid = new.resourceinstanceid
                where tileid = new.tileid;
            elsif (TG_OP = 'INSERT') then
                old_json_data = null;
                edit_type = 'tile create';
                if tile_id is null then
                    tile_id = public.uuid_generate_v1mc();
                end if;
                insert into public.tiles(
                    tileid,
                    tiledata,
                    nodegroupid,
                    parenttileid,
                    resourceinstanceid
                ) values (
                    tile_id,
                    json_data,
                    group_id,
                    parent_id,
                    new.resourceinstanceid
                );
            end if;
            perform __arches_refresh_tile_resource_relationships(tile_id);
            insert into bulk_index_queue (resourceinstanceid, createddate)
                values (new.resourceinstanceid, current_timestamp) on conflict do nothing;
            insert into edit_log (
                resourceclassid,
                resourceinstanceid,
                nodegroupid,
                tileinstanceid,
                edittype,
                newvalue,
                oldvalue,
                timestamp,
                note,
                transactionid
            ) values (
                graph_id,
                new.resourceinstanceid,
                group_id,
                tile_id,
                edit_type,
                json_data::jsonb,
                old_json_data,
                now(),
                'loaded via SQL backend',
                transaction_id
            );
            return new;
        end if;
    end;
$$ language plpgsql;
As part of this workaround, after you make any bulk updates or inserts to geometries, you’ll need to execute the following:
All file-based business data exports must happen through the command line interface. The output format can either be JSON (the best way to do a full dump of your Arches database) or CSV (a more curated way to export a specific subset of data). To use Arches data in other systems or export shapefiles, users will have to begin by creating a new resource database view (see below).
Note that you’ll have to provide the UUID for the Resource Model whose resources you want to export. The easiest way to find this UUID is by looking at the browser url while editing the Resource Model in the Arches Designer UI.
When exporting to CSV, you need to use a Mapping File, which will determine the content of your CSV (which nodes are exported, etc.). Add the --single_file argument to export your grouped data to the same CSV file as the rest of your data.
More about these export commands can be found in Export Commands.
To export to spatial formats such as shapefile, it is necessary to flatten the graph structure of your resources. One way to do this is to create a database view of your resource models. Arches does not do this automatically because there are many ways to design a flattened table depending on your needs.
You can add any number of database views representing a given resource model either for export, or to connect directly to a GIS client such as QGIS or ArcGIS. When writing a view to support shapefile export be sure that your view does not violate any shapefile restrictions. For example, shapefile field names are limited to 10 characters with no special characters and text fields cannot store more than 255 characters.
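The shapefile restrictions mentioned above can be checked against your view's column names and values before exporting. This is a minimal sketch with hypothetical helper names. It reads "no special characters" as letters, digits, and underscores only, which is an assumption, and it counts characters rather than encoded bytes.

```python
import re

# Assumption: allowed characters are letters, digits, and underscore.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,10}")
MAX_TEXT_LENGTH = 255


def invalid_field_names(field_names: list) -> list:
    """Return field names that are too long or contain other characters."""
    return [name for name in field_names if not FIELD_NAME_PATTERN.fullmatch(name)]


def overlong_values(row: dict) -> list:
    """Return the keys of text values longer than the shapefile text limit."""
    return [key for key, value in row.items() if isinstance(value, str) and len(value) > MAX_TEXT_LENGTH]


print(invalid_field_names(["geom", "geom_type", "construction_date", "name!"]))
print(overlong_values({"notes": "x" * 256, "name": "ok"}))
```

Columns flagged here can be renamed with shorter aliases in the view definition, and long text can be truncated or left out of the view.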
If you plan to use the arches export command to export your view as a shapefile, you also need to be sure that your view contains 2 fields: geom with the geometry representing your resource instance’s location and geom_type with the postgis geometry type of your geom column.
To write your view, you should start by getting a mapping file for your resource. You can do that by going to the Arches Designer page and then in the manage dropdown of your resource model select Create Mapping File. A zip file will be downloaded and within that file you will find your .mapping file. This file lists all the ids that you will need to design your view.
Below is an example of a simple resource model view. If a resource instance has a tile with geojson saved to it, that tile will be represented as a record in the view along with the corresponding nodeid and tileid. A unique id (gid) is assigned to each row. If a node has more than one geometry, the geometries are combined into a multipart geometry. If a node has more than one geometry of different types, a record will be created for each type. The UUID (ab74af76-fa0e-11e6-9e3e-026d961c88e6) in the last line of this example is the id of the view’s resource model.
When creating your own view, you will need to replace this UUID with your own resource model’s id. You can find this UUID in your mapping file assigned to the property: resource_model_id.
You will notice that for each node added as a column in the table, we perform a LEFT JOIN to the tiles table and the nodeid from which we want data. Here is an example joining to the tile containing the record node which has a nodeid of 677f2c0f-09cc-11e7-b412-6c4008b05c4c.
Starting with version 7.5, Arches moved to a new architectural pattern to support certain customization needs. This new pattern called Arches Applications (alternatively Arches Apps) should make customizations easier to develop and maintain. This architectural pattern also aligns with standard Django practices for the introduction of reusable sets of new features.
The phrase Arches application (or Arches app) describes a Python package that provides some set of features added to the core (standard) Arches application. Arches apps can be reused in multiple Arches projects. This terminology about applications and projects purposefully aligns with the Django definition and use of these terms.
Illustration of Arches projects integrating custom Arches Apps.
Arches Apps provide a means to power special purpose features that may not be appropriate for incorporation into the core (standard) Arches application. A given Arches App can be under version control independent of core Arches. This should make it easier to update and upgrade core Arches independently of a custom Arches App (and vice versa).
A given Arches App can also be developed and shared open source. This means that the custom features powered by an Arches App can be reused widely across the community. Because Arches App development can proceed independently of core Arches, Arches Apps can be an excellent way for community members to experiment with features beyond those listed on the official Arches software development roadmap.
Arches for Science illustrates the value of Arches apps. Arches for Science has several workflows and features (together with additional software dependencies) useful for cultural heritage conservation science. However, these features would be unnecessary for many other core Arches use cases. Keeping these conservation science features in a distinct app allows Arches for Science software development to continue at its own pace, and it reduces pressures to add highly specialized features to core Arches. Arches apps can therefore help reduce the complexity and maintenance costs of core Arches.
Through Arches apps, desired special features can be added to an Arches instance without forking the core (standard) Arches application code. There are many advantages to avoiding forks of the core (standard) Arches application code. By avoiding forks, one can more easily take advantage of continued upgrades and security patches applied to core Arches. This makes your use of Arches easier to maintain and secure.
A given Arches App can also be developed and shared open source. This means that the custom features powered by an Arches App can be reused across the community in multiple Arches projects.
The Arches team created a simple example Arches app to illustrate how to develop and deploy custom apps. The example app called Arches Dashboard displays a summary count of resource instances and tiles in a given Arches project.
The Arches Dashboard app provides an example of how to build a custom Arches application. Experience with Django in general, and Django app development in particular, would be very useful for Arches app development. The official Django documentation provides a great starting tutorial for learning how to create apps.
There are a number of patterns in place to allow you to extend Arches. Extensions can be used to customize the data entry process, add custom display widgets to reports, or even define new types of data that Arches can store.
Beginning in Arches 4.3, Cards are rendered using Card Components,
allowing them to be composed and nested arbitrarily in various
contexts within the Arches UI. Arches comes with a default Card
Component that should suit most needs, but you can also create and
register custom Card Components to extend the front-end behavior of
Arches.
Before exploring how to make customized Cards, please review the documentation
about the Card Types that come standard with Arches.
Developing Card Components is very similar to developing Widgets. A
Card Component consists of a Django template and Knockout.js
JavaScript file. To register your component, you’ll also need a JSON
file specifying its initial configuration.
To develop your new card, you’ll place files like so in your project:
The default template and Knockout files illustrate everything a Card
Component needs, and you’ll be extending this functionality. Your
template will provide conditional markup for various contexts
(‘editor-tree’, ‘designer-tree’, ‘permissions-tree’, ‘form’, and
‘report’), render all the card’s Widgets, and display other
information.
Here’s the template for the default Card Component:
To register your Component, you’ll need a JSON configuration file
looking a lot like this sample:
{"name":"My New Card","componentid":"eea17d6c-0c32-4536-8a01-392df734de1c","component":"/views/components/cards/my-new-card","componentname":"my-new-card","description":"An awesome new card that does wonderful things.","defaultconfig":{}}
componentid:
Optional A UUID4 for your Component. Feel free to generate
one in advance if that fits your workflow; if not, Arches will
generate one for you and print it to STDOUT when you register
the Component.
name:
Required The name of your new Card Component, visible in
the drop-down list of card components in the Arches Designer.
description:
Required A brief description of your component.
component:
Required The path to the component view you have
developed. Example: views/components/cards/sample-datatype
componentname:
Required Set this to the last part of component above.
defaultconfig:
Required You can provide user-defined default
configuration here. Make it a JSON dictionary of keys and
values. An empty dictionary is acceptable.
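Before registering a Card Component, you may want to sanity-check its configuration against the rules listed above. The following sketch is not part of Arches; `check_card_config` is a hypothetical helper that looks for missing required keys, a componentname that does not match the end of component, and a malformed componentid.

```python
import json
import uuid

REQUIRED_KEYS = ("name", "description", "component", "componentname", "defaultconfig")


def check_card_config(config: dict) -> list:
    """Return a list of problems found in a Card Component configuration."""
    problems = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "component" in config and "componentname" in config:
        if config["component"].rstrip("/").split("/")[-1] != config["componentname"]:
            problems.append("componentname should be the last part of component")
    if "componentid" in config:
        try:
            uuid.UUID(config["componentid"])
        except ValueError:
            problems.append("componentid is not a valid UUID")
    if not isinstance(config.get("defaultconfig", {}), dict):
        problems.append("defaultconfig should be a JSON object")
    return problems


# The sample configuration shown above.
sample = json.loads(
    '{"name":"My New Card","componentid":"eea17d6c-0c32-4536-8a01-392df734de1c",'
    '"component":"/views/components/cards/my-new-card","componentname":"my-new-card",'
    '"description":"An awesome new card that does wonderful things.","defaultconfig":{}}'
)
print(check_card_config(sample))
```

An empty list means none of these particular problems were found. Arches may apply further checks of its own when the component is registered.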
A DataType defines a type of business data. DataTypes are associated
with Nodes and Widgets. When you are designing your Cards, the Widgets
with the same DataType as the Node you are collecting data for will be
available. In your Branches, each Node with a DataType will honor the
DataType configuration you specify when you create it.
The simplest (non-configurable, non-searchable) DataTypes consist of a
single Python file. If you want to provide Node-specific configuration
to your DataType (such as whether to expose a Node with that DataType
to Advanced Search or how the data is rendered), you’ll also develop a
UI component comprising a Django template and JavaScript file.
In your Project, these files must be placed accordingly:
To begin, let’s examine the sample-datatype included with Arches:
from arches.app.datatypes.base import BaseDataType
from arches.app.models import models
from arches.app.models.system_settings import settings
# Match and Exists are used by append_search_filters below.
from arches.app.search.elasticsearch_dsl_builder import Exists, Match

sample_widget = models.Widget.objects.get(name="sample-widget")

# Registration details for this DataType.
details = {
    "datatype": "sample-datatype",
    "iconclass": "fa fa-file-code-o",
    "modulename": "datatypes.py",
    "classname": "SampleDataType",
    "defaultwidget": sample_widget,
    "defaultconfig": {"placeholder_text": ""},
    "configcomponent": "views/components/datatypes/sample-datatype",
    "configname": "sample-datatype-config",
    "isgeometric": False,
    "issearchable": False,
}


class SampleDataType(BaseDataType):
    def validate(self, value, row_number=None, source=None):
        # A value is accepted if it behaves like a string.
        errors = []
        try:
            value.upper()
        except:
            errors.append(
                {
                    "type": "ERROR",
                    "message": "datatype: {0} value: {1} {2} {3} - {4}. {5}".format(
                        self.datatype_model.datatype, value, row_number, source, "this is not a string", "This data was not imported.",
                    ),
                }
            )
        return errors

    def append_to_document(self, document, nodevalue, nodeid, tile):
        document["strings"].append({"string": nodevalue, "nodegroup_id": tile.nodegroup_id})

    def transform_export_values(self, value, *args, **kwargs):
        if value != None:
            return value.encode("utf8")

    def get_search_terms(self, nodevalue, nodeid=None):
        terms = []
        if nodevalue is not None:
            if settings.WORDS_PER_SEARCH_TERM == None or (len(nodevalue.split(" ")) < settings.WORDS_PER_SEARCH_TERM):
                terms.append(nodevalue)
        return terms

    def append_search_filters(self, value, node, query, request):
        try:
            if value["val"] != "":
                # "~" in the operator selects prefix matching; "!" negates the match.
                match_type = "phrase_prefix" if "~" in value["op"] else "phrase"
                match_query = Match(field="tiles.data.%s" % (str(node.pk)), query=value["val"], type=match_type,)
                if "!" in value["op"]:
                    query.must_not(match_query)
                    query.filter(Exists(field="tiles.data.%s" % (str(node.pk))))
                else:
                    query.must(match_query)
        except KeyError:
            pass
Your DataType needs, at minimum, to implement the validate
method. You’re also likely to implement the
transform_import_values or transform_export_values
methods. Depending on whether your DataType is spatial, you may need
to implement some other methods as well. If you want to expose Nodes
of your DataType to Advanced Search, you’ll also need to implement the
append_search_filters method.
You can get a pretty good idea of what methods you need to implement
by looking at the BaseDataType class in the Arches source code
located at arches/app/datatypes/base.py and below:
    import json
    from django.core.urlresolvers import reverse
    from arches.app.models import models


    class BaseDataType(object):

        def __init__(self, model=None):
            self.datatype_model = model

        def validate(self, value, row_number=None, source=None):
            return []

        def append_to_document(self, document, nodevalue, nodeid, tile):
            """
            Assigns a given node value to the corresponding key in a document in
            in preparation to index the document
            """
            pass

        def after_update_all(self):
            """
            Refreshes mv_geojson_geoms materialized view after save.
            """
            pass

        def transform_import_values(self, value, nodeid):
            """
            Transforms values from probably string/wkt representation to specified
            datatype in arches
            """
            return value

        def transform_export_values(self, value, *args, **kwargs):
            """
            Transforms values from probably string/wkt representation to specified
            datatype in arches
            """
            return value

        def get_bounds(self, tile, node):
            """
            Gets the bounds of a geometry if the datatype is spatial
            """
            return None

        def get_layer_config(self, node=None):
            """
            Gets the layer config to generate a map layer (use if spatial)
            """
            return None

        def should_cache(self, node=None):
            """
            Tells the system if the tileserver should cache for a given node
            """
            return False

        def should_manage_cache(self, node=None):
            """
            Tells the system if the tileserver should clear cache on edits for a
            given node
            """
            return False

        def get_map_layer(self, node=None):
            """
            Gets the array of map layers to add to the map for a given node
            should be a dictionary including (as in map_layers table):
            nodeid, name, layerdefinitions, isoverlay, icon
            """
            return None

        def clean(self, tile, nodeid):
            """
            Converts '' values to null when saving a tile.
            """
            if tile.data[nodeid] == '':
                tile.data[nodeid] = None

        def get_map_source(self, node=None, preview=False):
            """
            Gets the map source definition to add to the map for a given node
            should be a dictionary including (as in map_sources table):
            name, source (json)
            """
            tileserver_url = reverse('tileserver')
            if node is None:
                return None
            source_config = {
                "type": "vector",
                "tiles": ["%s/%s/{z}/{x}/{y}.pbf" % (tileserver_url, node.nodeid)]
            }
            count = None
            if preview == True:
                count = models.TileModel.objects.filter(data__has_key=str(node.nodeid)).count()
                if count == 0:
                    source_config = {
                        "type": "geojson",
                        "data": {
                            "type": "FeatureCollection",
                            "features": [
                                {
                                    "type": "Feature",
                                    "properties": {
                                        "total": 1
                                    },
                                    "geometry": {
                                        "type": "Point",
                                        "coordinates": [-122.4810791015625, 37.93553306183642]
                                    }
                                },
                                {
                                    "type": "Feature",
                                    "properties": {
                                        "total": 100
                                    },
                                    "geometry": {
                                        "type": "Point",
                                        "coordinates": [-58.30078125, -18.075412438417395]
                                    }
                                },
                                {
                                    "type": "Feature",
                                    "properties": {
                                        "total": 1
                                    },
                                    "geometry": {
                                        "type": "LineString",
                                        "coordinates": [
                                            [-179.82421875, 44.213709909702054],
                                            [-154.16015625, 32.69486597787505],
                                            [-171.5625, 18.812717856407776],
                                            [-145.72265625, 2.986927393334876],
                                            [-158.37890625, -30.145127183376115]
                                        ]
                                    }
                                },
                                {
                                    "type": "Feature",
                                    "properties": {
                                        "total": 1
                                    },
                                    "geometry": {
                                        "type": "Polygon",
                                        "coordinates": [
                                            [
                                                [-50.9765625, 22.59372606392931],
                                                [-23.37890625, 22.59372606392931],
                                                [-23.37890625, 42.94033923363181],
                                                [-50.9765625, 42.94033923363181],
                                                [-50.9765625, 22.59372606392931]
                                            ]
                                        ]
                                    }
                                },
                                {
                                    "type": "Feature",
                                    "properties": {
                                        "total": 1
                                    },
                                    "geometry": {
                                        "type": "Polygon",
                                        "coordinates": [
                                            [
                                                [-27.59765625, -14.434680215297268],
                                                [-24.43359375, -32.10118973232094],
                                                [0.87890625, -31.653381399663985],
                                                [2.28515625, -12.554563528593656],
                                                [-14.23828125, -0.3515602939922709],
                                                [-27.59765625, -14.434680215297268]
                                            ]
                                        ]
                                    }
                                }
                            ]
                        }
                    }
            return {
                "nodeid": node.nodeid,
                "name": "resources-%s" % node.nodeid,
                "source": json.dumps(source_config),
                "count": count
            }

        def get_pref_label(self, nodevalue):
            """
            Gets the prefLabel of a concept value
            """
            return None

        def get_display_value(self, tile, node):
            """
            Returns a list of concept values for a given node
            """
            return unicode(tile.data[str(node.nodeid)])

        def get_search_terms(self, nodevalue, nodeid=None):
            """
            Returns a nodevalue if it qualifies as a search term
            """
            return []

        def append_search_filters(self, value, node, query, request):
            """
            Allows for modification of an elasticsearch bool query for use in
            advanced search
            """
            pass

        def handle_request(self, current_tile, request, node):
            """
            Updates files
            """
            pass
Here, you write logic that the Tile model will use to accept or reject
a Node’s data before saving. This is the core implementation of what
your DataType is and is not.
The validate method returns an array of errors. If the array is
empty, the data is considered valid. You can populate the errors array
with any number of dictionaries with a type key and a message
key. The value for type will generally be ERROR, but you can
provide other kinds of messages.
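As a minimal sketch of that contract, the standalone function below mirrors the shape of a validate implementation without importing Arches. The name validate_string and the wording of the message are illustrative assumptions; in a real DataType this logic lives in your class's validate method.

```python
def validate_string(value, row_number=None, source=None):
    """Return a list of error dictionaries; an empty list means the value is valid."""
    errors = []
    if not isinstance(value, str):
        errors.append(
            {
                "type": "ERROR",
                "message": "value: {0} (row: {1}, source: {2}) is not a string. "
                "This data was not imported.".format(value, row_number, source),
            }
        )
    return errors


valid = validate_string("Roman villa")
invalid = validate_string(42, row_number=7, source="import.csv")
```

Here valid is an empty list, so the data would be accepted, while invalid holds a single dictionary with the type and message keys described above.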
In this method, you’ll create an ElasticSearch query for Nodes matching
this datatype based on input from the user in the Advanced Search
screen. (You design this input form in your DataType’s front-end
component.)
Arches has its own ElasticSearch query DSL builder class.
You’ll want to review that code for an idea of what to do. The search
view passes your DataType a Bool() query from this class, which you
modify directly. You can invoke its must, filter, should, or
must_not methods and pass complex queries you build with the DSL
builder’s Match class or similar. You’ll make these calls
directly in your append_search_filters method.
In-depth documentation of this part is planned, but for now, look at
the core datatypes
located in Arches’ source code for examples of the approaches you can
take here.
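To make the flow concrete without a running Arches instance, here is a rough sketch of the decision logic used by the sample-datatype. RecordingBool is a hypothetical stand-in for the Bool() query that only records which method was called, and the plain dictionaries are placeholders for the Match and Exists objects you would build with Arches’ DSL builder. The operator string in the usage lines is illustrative.

```python
class RecordingBool:
    """Hypothetical stand-in for the Bool() query; it only records calls."""

    def __init__(self):
        self.clauses = []

    def must(self, clause):
        self.clauses.append(("must", clause))

    def must_not(self, clause):
        self.clauses.append(("must_not", clause))

    def filter(self, clause):
        self.clauses.append(("filter", clause))


def append_search_filters(value, node_id, query):
    """Translate Advanced Search input into clauses on the supplied query."""
    try:
        if value["val"] == "":
            return
        field = "tiles.data.%s" % node_id
        # "~" in the operator selects a prefix match instead of an exact phrase
        match_type = "phrase_prefix" if "~" in value["op"] else "phrase"
        match = {"match": {"field": field, "query": value["val"], "type": match_type}}
        if "!" in value["op"]:
            # Negated search: exclude matches but still require the node to have a value
            query.must_not(match)
            query.filter({"exists": {"field": field}})
        else:
            query.must(match)
    except KeyError:
        # Incomplete input from the search form adds no filter
        pass


query = RecordingBool()
append_search_filters({"op": "!~", "val": "villa"}, "abc123", query)
```

After the call, query.clauses holds a must_not clause followed by a filter clause, which is the same sequence the sample-datatype produces for a negated search.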
Note
If you’re an accomplished Django developer, it should also be
possible to use Elastic’s own Python DSL builder in your Project
to build the complex search logic you’ll pass to Arches’ Bool()
search, but this has not been tested.
datatype:
Required The name of your datatype. The convention in
Arches is to use kebab-case here.
iconclass:
Required The FontAwesome icon class your DataType should
use. Browse them here.
modulename:
Required This should always be set to datatypes.py
unless you’ve developed your own Python module to hold your
many DataTypes, in which case you’ll know what to put here.
classname:
Required The name of the Python class implementing your
datatype, located in your DataType’s Python file below these
details.
defaultwidget:
Required The default Widget to be used for this DataType.
defaultconfig:
Optional You can provide user-defined default
configuration here.
configcomponent:
Optional If you develop a configuration component, put the
fully-qualified name of the view here. Example:
views/components/datatypes/sample-datatype
configname:
Optional The name of the Knockout component you have
registered in your UI component’s JavaScript file.
isgeometric:
Required Used by the Arches UI to determine whether to
create a Map Layer based on the DataType, and also for
caching. If you’re developing such a DataType, set this to
True.
issearchable:
Optional Determines if the datatype participates in advanced search.
The default is false.
Important
configcomponent and configname are required together.
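If you’d like to catch mistakes in a details dictionary before registering your DataType, a small check along these lines may help. This is only a sketch: check_details is a hypothetical helper, not part of Arches, and the set of required keys simply follows the list above.

```python
REQUIRED_KEYS = {"datatype", "iconclass", "modulename", "classname", "defaultwidget", "isgeometric"}
PAIRED_KEYS = ("configcomponent", "configname")


def check_details(details):
    """Return a list of human-readable problems found in a details dictionary."""
    missing = sorted(REQUIRED_KEYS - set(details))
    problems = ["missing required key: %s" % key for key in missing]
    # configcomponent and configname only make sense as a pair
    present = [key for key in PAIRED_KEYS if details.get(key)]
    if len(present) == 1:
        problems.append("configcomponent and configname must be supplied together")
    return problems


details = {
    "datatype": "sample-datatype",
    "iconclass": "fa fa-file-code-o",
    "modulename": "datatypes.py",
    "classname": "SampleDataType",
    "defaultwidget": None,  # in a real Project this is a Widget model instance
    "isgeometric": False,
    "configcomponent": "views/components/datatypes/sample-datatype",
}
problems = check_details(details)
```

Because configname is absent while configcomponent is set, problems contains one entry pointing at the mismatched pair.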
Your component JavaScript file should register a Knockout component
with your DataType’s configname. This component should be an
object with two keys: viewModel and template.
The value for viewModel should be a function where you put the
logic for your template. You’ll be setting up Knockout observable and
computed values tied to any form elements you’ve developed to collect
Advanced Search or Node-level configuration information from the user.
The value for template should be another object with the key
require, and the value should be
text!datatype-config-templates/<your-datatype-name>. Arches will
know what to do with this – it comes from the value you supplied in
your Python file’s details dictionary for configcomponent.
Pulling it all together, here’s the JavaScript portion of Arches’
date DataType.
If you’re supporting Advanced Search functionality for Nodes with your
DataType, your Django template will include a search block,
conditionally rendered by Knockout.js if the search view is
active. Here’s the one from the boolean datatype:
Note the <!-- ko if: $data.search --> and <!-- /ko --> directives opening and
closing the search block. These are not ordinary HTML comments – they are
Knockout.js-flavored markup for the conditional rendering.
Arches’ built-in date DataType does not use the Django template
block directive, but only implements advanced search, and contains
a more sophisticated example of the component logic needed:
This section of your template should be enclosed in Knockout-flavored
markup something like: <!-- ko if: $data.graph -->, and in your
Knockout function you should follow the convention and end up with
something like if (this.graph) {
Here, you put form elements corresponding to any configuration you’ve
implemented in your DataType. These should correspond to keys in your
DataType’s defaultconfig.
Arches’ boolean DataType has the following defaultconfig:
{'falseLabel':'No','trueLabel':'Yes'}
You can see the corresponding data bindings in the Django template:
Functions are the most powerful extension to Arches. Functions
associated with a Resource are called during various CRUD operations,
and have access to any server-side model. Proficient Python/Django
developers will find few limitations extending an Arches Project with
Functions.
Functions must be created, registered, and then associated with a
Resource Model.
A Function comprises three separate files, which should be seen as
front-end/back-end complements. On the front-end, you will need a
component made from a Django HTML template and JavaScript pair, which
should share the same basename.
In your Project, these files must be placed like so:
The third file is a Python file which contains a dictionary telling
Arches some important details about your Function, as well as its main
logic.
/myproject/myproject/functions/spatial_join.py
Note
As in the example above, it’s advisable that all of your files share
the same basename. (If your Function is moved into a Package, this
is necessary.) A new Project should have an example function in it
whose files you can copy to begin this process.
The first step in creating a function is defining the details at
the top of your Function’s .py file.
    details = {
        'name': 'Sample Function',
        'type': 'node',
        'description': 'Just a sample demonstrating node group selection',
        'defaultconfig': {"selected_nodegroup": ""},
        'classname': 'SampleFunction',
        'component': 'views/components/functions/sample-function'
    }
name:
Required Name is used to unregister a function, and shows up
in the fnlist command.
type:
Required As of version 4.2, this should always be set to node
description:
Optional Add a description of what your Function does.
defaultconfig:
Required A JSON object with any configuration needed to
serve your function’s logic
classname:
Required The name of the python class that holds this
Function’s logic.
Any configuration information you need your Function to access can be
stored here. If your function needs to calculate something based on
the value of an existing Node, you can refer to it here. Or, if you
want your Function to e-mail an administrator whenever a specific node
is changed, both the Node ID and the email address to be used are good
candidates for storage in the defaultconfig dictionary.
The defaultconfig field serves both as a default, and as your
user-defined schema for your function’s configuration data. Your
front-end component for the function will likely collect some of this
configuration data from the user and store it in the config
attribute of the pertinent FunctionXGraph.
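One way to picture defaultconfig acting as both default and schema is the sketch below, which overlays stored values on the defaults and ignores keys the defaults do not define. The effective_config helper and the notify_email key are assumptions made for illustration; Arches itself may handle stored configuration differently.

```python
import copy

# notify_email is a hypothetical key added for this example
DEFAULT_CONFIG = {"selected_nodegroup": "", "notify_email": ""}


def effective_config(stored_config):
    """Overlay stored values on the defaults, ignoring keys the defaults do not define."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    for key, value in (stored_config or {}).items():
        if key in config:
            config[key] = value
    return config


config = effective_config(
    {"selected_nodegroup": "574b58a3-e747-11e6-84a6-026d961c88e6", "stale_key": 1}
)
```

The resulting config keeps the stored nodegroup, falls back to the default for notify_email, and drops stale_key because it is not part of the schema.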
In your Function’s Python code, you have access to all your
server-side models. You’re basically able to extend Arches in any way
you please. You may want to review the Data Model
documentation.
Your function needs to extend the BaseFunction class. Depending on
what you are trying to do, you will need to implement the get,
save, delete, on_import, and/or after_function_save
methods.
Not all of these methods are called in the current Arches
software. You can also leave any of them unimplemented, and the
BaseFunction class will raise a NotImplementedError for
you. Arches is designed to gracefully ignore these exceptions for
functions.
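The pattern described above can be sketched in plain Python as follows. BaseFunctionSketch and call_hook are hypothetical stand-ins written for this example (they are not the Arches classes), and only the save and delete hooks are shown, with the tile and request parameters described in this section.

```python
class BaseFunctionSketch:
    """Hypothetical stand-in for BaseFunction: every hook raises by default."""

    def save(self, tile, request):
        raise NotImplementedError

    def delete(self, tile, request):
        raise NotImplementedError


class AuditFunction(BaseFunctionSketch):
    """Implements only save; delete is left unimplemented."""

    def __init__(self):
        self.saved = []

    def save(self, tile, request):
        self.saved.append(tile["tileid"])


def call_hook(function, hook_name, *args):
    """Call a hook and report whether it ran; unimplemented hooks are skipped."""
    try:
        getattr(function, hook_name)(*args)
        return True
    except NotImplementedError:
        return False


audit = AuditFunction()
ran_save = call_hook(audit, "save", {"tileid": "t-1"}, None)
ran_delete = call_hook(audit, "delete", {"tileid": "t-1"}, None)
```

In this sketch ran_save is True and ran_delete is False: the unimplemented hook raises NotImplementedError and the caller simply moves on.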
A detailed description of current functionality is below.
The Tile object will look up all its Graph’s associated Functions
upon being saved. Before writing to the database, it calls each
function’s save method, passing itself along with the Django
Request object. This is likely where the bulk of your function’s
logic will reside.
The Tile object similarly calls each of its graph’s
functions’ delete methods with the same parameters. Here, you can
execute any cleanup or other desired side effects of a Tile’s
deletion. Your delete implementation will have the same signature
as save.
The Graph view passes a FunctionXGraph object to
after_function_save, along with the request.
The FunctionXGraph object has a config attribute which stores that
instance’s version of the defaultconfig dictionary. This is a good
opportunity, for example, to programmatically manipulate the
Function’s configuration based on the Graph or any other server-side
object.
You can also write any general logic that you’d like to fire upon the
assignment of a Function to a Resource.
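As a rough illustration, the sketch below fills in an empty configuration value when the hook fires. SampleFunctionSketch is a plain class rather than a real BaseFunction subclass, the SimpleNamespace object stands in for a FunctionXGraph, and fallback_nodegroup is a hypothetical attribute invented for the example.

```python
from types import SimpleNamespace


class SampleFunctionSketch:
    """Illustrative only; a real Function would extend Arches' BaseFunction."""

    def after_function_save(self, functionxgraph, request):
        config = functionxgraph.config
        # Fill in a nodegroup when the user left the setting empty
        if not config.get("selected_nodegroup"):
            config["selected_nodegroup"] = functionxgraph.fallback_nodegroup
        return config


# Stand-in for a FunctionXGraph object; fallback_nodegroup is hypothetical
fxg = SimpleNamespace(
    config={"selected_nodegroup": ""},
    fallback_nodegroup="574b58a3-e747-11e6-84a6-026d961c88e6",
)
SampleFunctionSketch().after_function_save(fxg, request=None)
```

After the call, fxg.config carries the fallback nodegroup. In a real Function you would persist the change through the model rather than a stand-in object.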
Having implemented your function’s logic, it’s time to develop the
front-end components required to associate it with Resources and
provide any configuration data.
The component you develop here will be rendered in the Resource
Manager when you associate the function with a Resource, and this is
where you’ll put any forms or other UI artifacts used to configure the
Function.
Developing your Function’s UI component is very similar to developing
Widgets. More specific guidelines are in progress, but for now,
refer to the sample code in your project’s
templates/views/components/functions/ directory, and gain a little
more insight from the templates/views/components/widgets/
directory. The complementary JavaScript examples will be located in
media/js/views/components/functions/ and
media/js/views/components/widgets directories.
Now navigate to the Function Manager in the Arches Designer to confirm
that your new function is there and functional. If it’s not, you may
want to unregister your function, make additional changes, and
re-register it. To unregister your function, simply run
Plugins allow a developer to create an independent page in Arches that is accessible from the main navigation menu.
For example, you may need a customized way of visualizing your resource data. A plugin would enable you to design such an interface.
Plugins, like widgets and card components, rely only on front-end code. Ajax queries, generally calls to the API, must be used to access any server-side data.
Optional A UUID4 for your Plugin. Feel free to generate
one in advance if that fits your workflow; if not, Arches will
generate one for you and print it to STDOUT when you register
the Plugin.
name:
Required The name of your new Plugin, visible when a user hovers over the main navigation menu
icon:
Required The icon visible in the main navigation menu.
component:
Required The path to the component view you have
developed. Example: views/components/plugins/sample-plugin
componentname:
Required Set this to the last part of component above.
config:
Required You can provide user-defined default
configuration here. Make it a JSON dictionary of keys and
values. An empty dictionary is acceptable.
slug:
Required The string that will be used in the url to access your plugin
sortorder:
Required The order in which your plugin will be listed if there are multiple plugins
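If you prefer to generate the registration JSON rather than write it by hand, something like the following sketch could work. build_plugin_config is a hypothetical helper, the icon value is only an example, and the optional UUID is left out so that Arches generates one at registration.

```python
import json


def build_plugin_config(name, icon, component, slug, sortorder=0, config=None):
    """Assemble a registration dictionary from the keys described above."""
    return {
        "name": name,
        "icon": icon,
        "component": component,
        # componentname is the last part of the component path
        "componentname": component.rsplit("/", 1)[-1],
        "config": config if config is not None else {},
        "slug": slug,
        "sortorder": sortorder,
    }


plugin = build_plugin_config(
    name="Sample Plugin",
    icon="fa fa-share-alt",
    component="views/components/plugins/sample-plugin",
    slug="sample-plugin",
)
plugin_json = json.dumps(plugin, indent=4)
```

The plugin_json string can then be saved to a .json file in your project. Deriving componentname from component keeps the two values from drifting apart.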
Arches enables projects to have custom reports on a per-resource model basis. Below is a guide to create and implement a custom resource report.
In your project, you’ll need to create files in the following directories. If any directories listed here do not exist in your project, create them first.
Sample report .htm file (note that extending the core arches default report is optional. See core arches default report for reference on overriding specific tagged sections, e.g. “{% block header %}”.):
    {% extends "views/report-templates/default.htm" %}
    {% load i18n %}
    {% block body %}
    <!--ko if: hasProvisionalData() && (editorContext === false) -->
    <div class="report-provisional-flag">{% trans 'This resource has provisional edits (not displayed in this report) that are pending review' %}</div>
    <!--/ko-->
    <!--ko if: hasProvisionalData() && (editorContext === true && report.userisreviewer === true) -->
    <div class="report-provisional-flag">{% trans 'This resource has provisional edits (not displayed in this report) that are pending review' %}</div>
    <!--/ko-->
    <!--ko if: hasProvisionalData() && (editorContext === true && report.userisreviewer === false) -->
    <div class="report-provisional-flag">{% trans 'This resource has provisional edits that are pending review' %}</div>
    <!--/ko-->
    <div class="rp-report-section relative rp-report-section-root">
        <div class="rp-report-section-title">
            <!-- ko foreach: { data: report.cards, as: 'card' } -->
                <!-- ko if: !!(ko.unwrap(card.tiles).length > 0) -->
                    <!-- ko if: $index() !== 0 -->
                    <hr class="rp-tile-separator">
                    <!-- /ko -->
                    <div class="rp-card-section">
                        <!-- ko component: {
                            name: card.model.cardComponentLookup[card.model.component_id()].componentname,
                            params: {
                                state: 'report',
                                preview: $parent.report.preview,
                                card: card,
                                pageVm: $root,
                                hideEmptyNodes: $parent.hideEmptyNodes
                            }
                        } --> <!-- /ko -->
                    </div>
                <!-- /ko -->
            <!-- /ko -->
        </div>
    </div>
    {% endblock body %}
Before registering your report, ensure that named references to the various report files are consistent. For ease, it is recommended to use one single name for all files to match the component name. Check the named references in your .js file to your component as well as the template name in case you encounter issues later.
Finally, in the Arches Graph Designer interface, navigate to the “Cards” tab of the resource model this report is for, and click the root/top node in the card tree (it carries the name of the graph/resource model) on the left-hand side. On the far right you will see a heading “Report Configuration”. Select your custom report from the dropdown labeled “Template”, and save changes.
Troubleshooting Tips
Ensure that all references to a component name are consistent.
Ensure that references to a template (.htm file) are consistent.
Ensure your report exists in your database by checking the “report_templates” table.
Further Interest
Because templates often call other templates, e.g. the default report template for a resource instance in turn calls the default card component template, it may be of interest to either override or create a custom component for cards which get rendered within resource reports.
Widgets allow you to customize how data of a certain DataType is
entered into Arches, and further customize how that data is presented
in Reports. You might have several Widgets for a given DataType,
depending on how you want the Report to look or to match the context
of a certain Resource.
Widgets are primarily a UI artifact, though they are closely tied to
their underlying DataType.
To develop a custom Widget, you’ll need to write three separate files,
and place them in the appropriate directories. For the appearance and
behavior of the Widget, you’ll need a component made of a Django
template and JavaScript file placed like so:
The most important field here is the datatype field. This controls
where your Widget will appear in the Arches Resource Designer. Nodes
each have a DataType, and Widgets matching that DataType will be
available when you’re designing your Cards. The value must match an
existing DataType within Arches.
You can also populate the defaultconfig field with any
configuration data you wish, to be used in your Widget’s front-end
component.
Your Widget’s template needs to include three Django template “blocks”
for rendering the Widget in different contexts within Arches. These
blocks are called form, config_form, and report. As you might
guess from their names, form is rendered when your Widget appears on
a Card for business data entry, config_form is rendered when you
configure the Widget on a card when designing a Resource, and report
controls how data from your Widget is presented in a Report.
To pull it all together, you’ll need to write a complementary
JavaScript file. The Arches UI uses Knockout.js, and the best way to
develop your Widget in a compatible way is to write a Knockout
component with a viewModel corresponding to your Widget’s view
(the Django template).
Here is an example, continuing with our sample-widget:
    define([
        'knockout',
        'underscore',
        'viewmodels/widget',
        'templates/views/components/widgets/sample-widget.htm'
    ], function(ko, _, WidgetViewModel, sampleWidgetTemplate) {
        /**
         * registers a text-widget component for use in forms
         * @function external:"ko.components".text-widget
         * @param {object} params
         * @param {string} params.value - the value being managed
         * @param {function} params.config - observable containing config object
         * @param {string} params.config().label - label to use alongside the text input
         * @param {string} params.config().placeholder - default text to show in the text input
         */
        return ko.components.register('sample-widget', {
            viewModel: function(params) {
                params.configKeys = ['x_placeholder', 'y_placeholder'];
                WidgetViewModel.apply(this, [params]);
                var self = this;
                if (this.value()) {
                    var coords = this.value().split('POINT(')[1].replace(')', '').split(' ');
                    var srid = this.value().split(';')[0].split('=')[1];
                    this.x_value = ko.observable(coords[0]);
                    this.y_value = ko.observable(coords[1]);
                    this.srid = ko.observable('4326');
                } else {
                    this.x_value = ko.observable();
                    this.y_value = ko.observable();
                    this.srid = ko.observable('4326');
                };
                this.preview = ko.pureComputed(function() {
                    var res = "SRID=" + this.srid() + ";POINT(" + this.x_value() + " " + this.y_value() + ")";
                    this.value(res);
                    return res;
                }, this);
            },
            template: sampleWidgetTemplate,
        });
    });
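The value this sample widget reads and writes is a string of the form SRID=<srid>;POINT(<x> <y>). If you need to handle the same value on the server side (for instance in a matching DataType), the format can be sketched in Python as below. parse_point and format_point are hypothetical helper names, not Arches functions.

```python
import re

EWKT_POINT = re.compile(r"^SRID=(\d+);POINT\((\S+) (\S+)\)$")


def parse_point(value):
    """Split 'SRID=4326;POINT(x y)' into (srid, x, y) strings, or return None."""
    match = EWKT_POINT.match(value or "")
    return match.groups() if match else None


def format_point(x, y, srid="4326"):
    """Build the same string the widget's preview computes."""
    return "SRID=%s;POINT(%s %s)" % (srid, x, y)


text = format_point("-122.48", "37.93")
parts = parse_point(text)
```

Keeping the coordinates as strings avoids changing their precision when the value round-trips between the widget and the server.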
Workflows are a type of Plugin that can simplify the data entry process. A workflow is composed of one or more cards from a resource model, placing them in a step-through set of forms. This provides users the ability to create new resource instances without having to traverse card-by-card through the resource model tree.
In other words, instead of using this interface to create a new resource:
…a workflow can pare down the data entry interface to look something like this:
A simple workflow abstracts data entry away from the card tree into forms.#
Workflows can be complex too, facilitating the creation of many different inter-related resource instances simultaneously. We’ll use a very simple example here, however, to show how a workflow can be used to extract just a few cards from a large resource model to facilitate a “quick create” task that is easy for users to complete.
A very simple workflow will be presented here, based on the arches-example-pkg resource model called “Heritage Resource Model”. This resource model has many cards, but we will make a workflow that pulls out just three of these cards–Name/Name Type, Resource Type Classification, and Keyword.
Workflows follow the standard extension pattern: an HTML/JS component and a JSON config. For this example, we have registration configs stored here:
Because Workflows are just Plugins, their registration configurations are constructed the same way. See Registering your Plugin for more about how to create the JSON file. For our purposes, it will look like this:
The workflow’s behavior is defined in quick-resource-create-workflow.js. You’ll begin with the boilerplate content below. Note that:
The file name, registered component name, and this.componentName must all match.
The stepConfig attribute will hold the full list of configurations for each step of the workflow.
    define([
        'knockout',
        'jquery',
        'arches',
        'viewmodels/workflow',
        'templates/views/components/plugins/quick-resource-create-workflow.htm',
        // DEFINE EXTRA STEP COMPONENTS HERE AS NEEDED
        'views/components/workflows/final-step'
    ], function(ko, $, arches, Workflow, quickResourceCreateWorkflowTemplate) {
        return ko.components.register('quick-resource-create-workflow', {
            viewModel: function(params) {
                this.componentName = 'quick-resource-create-workflow';
                this.quitUrl = "/search";
                this.stepConfig = [
                    // ADD STEP CONFIG ITEMS HERE
                ];
                Workflow.apply(this, [params]);
            },
            template: quickResourceCreateWorkflowTemplate,
        });
    });
Now let’s look at what one of these stepConfig items should look like. At minimum, it will have the following properties:
title:
This will appear in the tab for the step.
name:
An internal id for this workflow step that may be referenced by later steps, for example 'initial-step'. This value must be unique across all other steps in the workflow.
required:
Use true or false to determine whether this step must be completed by the user before moving on to the next one.
workflowstepclass:
The class for this step. (Need more info here)
informationboxdata:
The information box gives users guidance on how to complete the workflow step, and must consist of heading and text elements (see example below).
layoutSections:
A list of the sections that appear within this step. These items will be covered in more detail next.
Put together, a stepConfig will look something like this:
    {
        title: 'Create Historic Resource',
        name: 'set-basic-info',
        required: true,
        workflowstepclass: 'create-project-project-name-step',
        informationboxdata: {
            heading: 'Create historic resource here',
            text: 'Begin by providing the name and type of historic resource you are adding to the database.',
        },
        layoutSections: [
            // ADD LAYOUT SECTIONS HERE
        ]
    }
Other properties may be present in a step config if they are needed for more complex workflows.
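Since step names must be unique and the information box needs both a heading and text, it can be handy to sanity-check a step list before wiring it into the workflow. The sketch below does this in Python on dictionaries that mirror the JavaScript objects. check_steps is a hypothetical helper, and treating only title, name, and layoutSections as mandatory is an assumption based on the examples in this section.

```python
REQUIRED_STEP_KEYS = ("title", "name", "layoutSections")


def check_steps(step_config):
    """Return a list of problems found in a list of step dictionaries."""
    problems = []
    seen_names = set()
    for index, step in enumerate(step_config):
        for key in REQUIRED_STEP_KEYS:
            if key not in step:
                problems.append("step %d: missing %s" % (index, key))
        name = step.get("name")
        if name in seen_names:
            problems.append("step %d: duplicate name %r" % (index, name))
        seen_names.add(name)
        info = step.get("informationboxdata")
        # When present, the information box must carry both heading and text
        if info is not None and not {"heading", "text"} <= set(info):
            problems.append("step %d: informationboxdata needs heading and text" % index)
    return problems


steps = [
    {
        "title": "Create Historic Resource",
        "name": "set-basic-info",
        "required": True,
        "informationboxdata": {"heading": "Create historic resource here"},
        "layoutSections": [],
    },
    {"title": "Add Keywords", "name": "set-basic-info", "layoutSections": []},
]
problems = check_steps(steps)
```

For this deliberately broken input, problems reports the incomplete information box on the first step and the duplicated name on the second.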
A workflow step can have one or more layoutSections, each of which contains a list of componentConfigs. Component configs are where we reference the part of the resource model that we want users to access. Simple workflows like our example can use multiple component configs that point to different nodegroups, but more complex steps will typically have only one layout section with one component config, the latter ultimately pointing to a custom step component.
The properties of a component config are as follows:
componentName:
The id of the UI component that will be used to render this piece of the step. This can be 'default-card' to use Arches’ default display. However, you can also write workflow-specific step components to handle more complex behavior, and this is where you would reference them. (Any custom step components used here must be added to the define list at the top of the file.)
uniqueInstanceName:
An id by which this component can be referenced. This value must be unique across all other component configs in the step.
tilesManaged:
This must be 'none', 'one', or 'multi', and it determines how many new tiles will be created with this component. Even if the card has a cardinality > 1 in the resource model, setting 'one' here will still disallow multiple values from being created.
parameters:
These parameters will be passed to the component. Typically, in the first step you will only use graphid (for the resource model) and nodegroupid (to determine which nodegroup/card to show). Later steps will also need to be passed the resourceid which is pulled from the first step. Keep in mind that custom step components may require extra parameters.
    {
        componentName: 'default-card',
        uniqueInstanceName: 'resource-name', /* unique to step */
        tilesManaged: 'one',
        parameters: {
            graphid: '99417385-b8fa-11e6-84a5-026d961c88e6',
            nodegroupid: '574b58a3-e747-11e6-84a6-026d961c88e6',
        }
    }
In this example, graphid refers to the UUID for the Heritage Resource Model, and nodegroupid is the UUID for the nodegroup that holds the Name and Name Type nodes.
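The two constraints on component configs (an allowed tilesManaged value, and a uniqueInstanceName that is unique within the step) can be checked with a few lines of Python over dictionaries that mirror the JavaScript objects. This is only a sketch; check_component_configs is a hypothetical helper and the second config below is intentionally wrong.

```python
ALLOWED_TILES_MANAGED = {"none", "one", "multi"}


def check_component_configs(component_configs):
    """Return problems found across all component configs of a single step."""
    problems = []
    seen = set()
    for cfg in component_configs:
        instance = cfg.get("uniqueInstanceName")
        if instance in seen:
            problems.append("duplicate uniqueInstanceName: %s" % instance)
        seen.add(instance)
        if cfg.get("tilesManaged") not in ALLOWED_TILES_MANAGED:
            problems.append("%s: tilesManaged must be 'none', 'one', or 'multi'" % instance)
    return problems


step = {
    "layoutSections": [
        {"componentConfigs": [{"componentName": "default-card", "uniqueInstanceName": "resource-name", "tilesManaged": "one"}]},
        {"componentConfigs": [{"componentName": "default-card", "uniqueInstanceName": "resource-name", "tilesManaged": "many"}]},
    ]
}
# Uniqueness applies across the whole step, so flatten all layout sections first
configs = [cfg for section in step["layoutSections"] for cfg in section["componentConfigs"]]
problems = check_component_configs(configs)
```

Flattening the layout sections before checking matters because two configs in different sections of the same step still share one namespace.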
Note
There are a couple of ways to find the nodegroupid.
In the Arches web UI, open the graph designer for the resource model and use your browser’s dev tools to isolate the element for the nodegroup you want. The UUID will be visible in the HTML.
In the second step of our example workflow, where the user will enter a keyword for the new resource, we’ll need to pass an extra parameter resourceid that was created in the first step. Doing so looks like this:
    parameters: {
        graphid: '99417385-b8fa-11e6-84a5-026d961c88e6',
        nodegroupid: '3d919f0d-e747-11e6-84a6-026d961c88e6', // UUID for the Keyword nodegroup
        resourceid: "['set-basic-info']['resource-name'][0]['resourceInstanceId']",
    }
To break this resourceid entry down:
'set-basic-info' is the name of the step from which we are pulling the id (see our first step above)
'resource-name' is the uniqueInstanceName of the component config in which the tile was created
0 is the first tile object
'resourceInstanceId' is the property of the tile that we are looking for
Patterns like this can be used elsewhere within workflows to pass information from step to step.
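To see how such a path string reads, the following sketch walks a nested structure one bracketed token at a time. It is not how Arches resolves the value internally. resolve_path is a hypothetical helper, and the shape of workflow_data and the id in it are made up for the example.

```python
import re

# Matches either a quoted key such as ['set-basic-info'] or an index such as [0]
TOKEN = re.compile(r"\[(?:'([^']*)'|(\d+))\]")


def resolve_path(path, data):
    """Walk nested dicts and lists following a path like "['a']['b'][0]['c']"."""
    current = data
    for key, index in TOKEN.findall(path):
        current = current[int(index)] if index else current[key]
    return current


# Hypothetical data: step name -> uniqueInstanceName -> list of tiles
workflow_data = {
    "set-basic-info": {
        "resource-name": [{"resourceInstanceId": "example-resource-id"}],
    }
}
resource_id = resolve_path(
    "['set-basic-info']['resource-name'][0]['resourceInstanceId']", workflow_data
)
```

Each token narrows the lookup in the order described above: the step, then the component config, then the first tile, then the property on that tile.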
The final step of our example workflow looks like this:
    {
        title: 'Finish',
        name: 'add-resource-complete', /* unique to workflow */
        description: 'Finish the resource creation.',
        layoutSections: [
            {
                componentConfigs: [
                    {
                        componentName: 'final-step',
                        uniqueInstanceName: 'create-resource-final',
                        tilesManaged: 'none',
                        parameters: {
                            resourceid: "['set-basic-info']['resource-name'][0]['resourceInstanceId']",
                        },
                    },
                ],
            },
        ],
    }
As you can see, no tiles are created here, and we are using the default 'final-step' component that Arches provides (you’ll note this component is defined at the top of the file). This step will contain a save/cancel prompt.
Workflows often contain more elaborate final steps than the default one presented here. For example, you may want to list all of the data that has been entered throughout the workflow so the user can review it before saving. This behavior is not available by default, but here is an example of a final step with that capability:
Putting it all together, our main workflow component looks like this:
define([
    'knockout',
    'jquery',
    'arches',
    'viewmodels/workflow',
    'templates/views/components/plugins/quick-resource-create-workflow.htm',
    'views/components/workflows/final-step'
], function(ko, $, arches, Workflow, quickResourceCreateWorkflowTemplate) {
    return ko.components.register('quick-resource-create-workflow', {
        viewModel: function(params) {
            this.componentName = 'quick-resource-create-workflow';
            this.quitUrl = "/search";
            this.stepConfig = [
                {
                    title: 'Create Historic Resource',
                    name: 'set-basic-info',  /* unique to workflow */
                    required: true,
                    workflowstepclass: 'create-project-project-name-step',
                    informationboxdata: {
                        heading: 'Create historic resource here',
                        text: 'Begin by providing the name and type of historic resource you are adding to the database.',
                    },
                    layoutSections: [
                        {
                            componentConfigs: [
                                {
                                    componentName: 'default-card',
                                    uniqueInstanceName: 'resource-name',  /* unique to step */
                                    tilesManaged: 'one',
                                    parameters: {
                                        graphid: '99417385-b8fa-11e6-84a5-026d961c88e6',
                                        nodegroupid: '574b58a3-e747-11e6-84a6-026d961c88e6',
                                    },
                                },
                            ],
                        },
                        {
                            componentConfigs: [
                                {
                                    componentName: 'default-card',
                                    uniqueInstanceName: 'resource-type',  /* unique to step */
                                    tilesManaged: 'one',
                                    parameters: {
                                        graphid: '99417385-b8fa-11e6-84a5-026d961c88e6',
                                        nodegroupid: '620aac67-e747-11e6-84a6-026d961c88e6',
                                    },
                                },
                            ],
                        },
                    ]
                },
                {
                    title: 'Add Keywords',
                    name: 'add-keywords',  /* unique to workflow */
                    required: false,
                    informationboxdata: {
                        heading: 'Add a keyword',
                        text: 'Optionally add keywords to this historic resource.',
                    },
                    layoutSections: [
                        {
                            componentConfigs: [
                                {
                                    componentName: 'default-card',
                                    uniqueInstanceName: 'resource-keywords',  /* unique to step */
                                    tilesManaged: 'one',
                                    parameters: {
                                        graphid: '99417385-b8fa-11e6-84a5-026d961c88e6',
                                        nodegroupid: '3d919f0d-e747-11e6-84a6-026d961c88e6',
                                        resourceid: "['set-basic-info']['resource-name'][0]['resourceInstanceId']",
                                    },
                                },
                            ],
                        },
                    ],
                },
                {
                    title: 'Finish',
                    name: 'add-resource-complete',  /* unique to workflow */
                    description: 'Finish the resource creation.',
                    layoutSections: [
                        {
                            componentConfigs: [
                                {
                                    componentName: 'final-step',
                                    uniqueInstanceName: 'create-resource-final',
                                    tilesManaged: 'none',
                                    parameters: {
                                        resourceid: "['set-basic-info']['resource-name'][0]['resourceInstanceId']",
                                    },
                                },
                            ],
                        },
                    ],
                }
            ];
            Workflow.apply(this, [params]);
        },
        template: quickResourceCreateWorkflowTemplate,
    });
});
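The resourceid values used by the later steps are path strings that point back at data saved by an earlier step: the step name, then the component's uniqueInstanceName, then a tile index, then a key. The Python sketch below is not how Arches resolves these paths (that happens in the workflow JavaScript); it only illustrates how such a path reads. The resolve_path helper and the saved_steps data are hypothetical.

```python
import re

# Matches either a quoted key such as ['set-basic-info'] or an index such as [0].
TOKEN = re.compile(r"\[(?:'([^']*)'|(\d+))\]")


def resolve_path(path, data):
    """Walk `data` following a path like "['step']['component'][0]['key']"."""
    current = data
    for key, index in TOKEN.findall(path):
        # Exactly one of the two groups is non-empty for each token.
        current = current[int(index)] if index else current[key]
    return current


# Hypothetical data saved by the first step of the workflow.
saved_steps = {
    "set-basic-info": {
        "resource-name": [{"resourceInstanceId": "example-resource-id"}],
    }
}

path = "['set-basic-info']['resource-name'][0]['resourceInstanceId']"
resource_id = resolve_path(path, saved_steps)
```

Read this way, the path in the 'add-keywords' step simply says "use the resource instance created by the 'resource-name' card in the 'set-basic-info' step".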
You may want to create custom components for your workflow steps to handle more complex data entry tasks. These should be stored in a workflow directory, or grouped into subdirectories thematically. A step component can be used by any workflow, as long as it is passed the correct parameters.
Important
If you are loading a package with a workflow in it, you will need to manually copy step component files into your project; they are not handled by the package load process.
Here are some examples of workflows that use custom step components you can look at when beginning to construct your own:
After placing your workflow files in the proper directories within your project, you are ready to register it. See Plugin Commands for more information.
If the Workflow (or any other Plugin) is registered but is not visible, an administrator must grant access to it via the Django admin panel on a per-user or per-group basis.
Navigate to localhost:8000/admin and log in, then locate the profiles for the user(s) or group(s) that should be able to access the Workflow. Find the “User/Group permissions” section, scroll to your workflow, and add the “view” privilege. Click “SAVE” to finish.
Used to create modular data entry and data display units that can be nested arbitrarily in the UI. Creating a custom card component can provide a more complex UI.
Workflows are a special type of Plugin that allow you to abstract the data entry process away from the default graph tree interface into a step-through set of pages.
Basic Structure
my_project/plugins/my-workflow.json <-- Registration config
my_project/media/js/views/components/plugins/my-workflow.js <-- Main UI component
my_project/templates/views/components/plugins/my-workflow.htm <-- Main UI component
Though there is some variation across extension types, Arches does use a common architecture pattern to construct extensions. Generally speaking, the user interface for the extension exists in a new component (JS/HTML), and any backend code (if applicable) will be in a module (Python). Initial configuration details will be stored in json, either in a standalone file or at the top of a module.
All extensions are expected to have some sort of user interface, and this is created with a pair of files: one HTML template (.htm) and one JavaScript file (.js). Components are constructed with KnockoutJS.
These files must live here (using a widget as an example):
A .json file will store a set of initial configuration details about the extension, which are loaded into the database when the extension is registered. This file is only used during registration.
A few extension types, like Functions, are written in Python. For these, a .py module must be supplied. Instead of a JSON configuration file, initial configs are stored at the beginning of the module in a dictionary named details.
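As a rough illustration of that layout, the sketch below shows a module that opens with a details dictionary followed by the class it names. The key names and values are assumptions chosen for illustration and may not match what your Arches version expects; SampleFunction is a hypothetical stand-in, and a real Function would subclass the appropriate Arches base class.

```python
# Hypothetical initial configuration, read when the extension is registered.
# The key names below are illustrative assumptions, not a verified schema.
details = {
    "name": "Sample Function",
    "type": "node",
    "description": "Reports each tile that is saved",
    "defaultconfig": {"triggering_nodegroups": []},
    "classname": "SampleFunction",
    "component": "views/components/functions/sample-function",
}


class SampleFunction:
    """Stand-in for the extension logic named by details["classname"]."""

    def save(self, tile, request):
        # A real implementation would act on the tile; this only describes it.
        return f"saved tile {tile}"
```

Keeping the configuration next to the code means one file carries both what is registered and what runs.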
A module’s location follows the same pattern as JSON configuration files:
Extensions are “registered” and “unregistered” in one of two ways:
Via the CLI
As part of a package load
For CLI commands, see the end of each extension type’s documentation, or check out the Command Line Reference page. Check out Understanding Packages for more about how and where extensions are stored within packages.
This means that when you run yarn install on this file, all dependencies from the corresponding branch of the core Arches repo will be installed (in this example, package.json from stable/6.0.1).
To add a new package, you just need to run yarn add <package name> in your project. This will install the new package and update your package.json file accordingly.
For example, to add OpenLayers, enter the my_project directory and run yarn add ol. Your package.json will now look something like:
If you are developing a project, keep track of which version of Arches you are developing against and make sure it is properly reflected in your package.json.
Note
When you register a new extension there is no way to automate the installation of a new JS dependency, so you’ll need to manually run yarn add as described above.
A developer can add new layers to the map by registering them through the command line interface.
New map layers can come from many different geospatial sources – from shapefiles to GeoTIFFs to external Web Map Services to reconfigurations of the actual resource data stored within Arches.
New map layers can be created with two general definitions, as MapBox layers or tileserver layers, each with its own wide range of options.
Arches allows you to make direct references to styles or layers that have been previously defined in MapBox Studio. You can make entirely new basemap renderings, save them in your MapBox account, then download the style definition and use it here. Read more about MapBox Styles.
Additionally, you can take a MapBox JSON file and place any mapbox.js layer definition in the layers section, as long as you define its source in the sources section.
Note
One thing to be aware of when trying to cascade a WMS through a MapBox layer is that mapbox.js is much pickier about CORS than other js mapping libraries like Leaflet. To use an external WMS or tileset, you may be better off using a tileserver layer as described below. You can find WMS examples in the arches4-geo-examples repo.
In Arches, it’s possible to add a vector layer whose features may be “selectable”. This is especially useful during drawing operations. For example, a building footprint dataset could be added as a selectable vector layer, and while creating new building resources you would select and “transfer” these geometries from the overlay to the new Arches resource.
First, the data source for the layer may be geojson or vector tiles. This could be a tile server layer serving vector features from PostGIS, for example.
Add a property to your vector features called “geojson”.
Populate this property with either the entire geojson geometry for the feature, or a url that will return a json response containing the entire geojson geometry for the feature. This is necessary to handle the fact that certain geometries may extend across multiple vector tiles.
Add the overlay as you would any tileserver layer (see above).
You will now be able to add this layer to the map and select its features by clicking on them.
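If the overlay comes from a GeoJSON file rather than PostGIS, the "geojson" property described above can be added in a preprocessing step. The following is a minimal sketch, assuming the property should hold the geometry serialized as a JSON string (mirroring what st_asgeojson returns); the add_geojson_property helper is hypothetical.

```python
import json


def add_geojson_property(feature_collection):
    """Store each feature's full geometry in a "geojson" property as a JSON string."""
    for feature in feature_collection["features"]:
        # GeoJSON allows "properties" to be null, so fall back to a new dict.
        properties = feature.get("properties") or {}
        properties["geojson"] = json.dumps(feature["geometry"])
        feature["properties"] = properties
    return feature_collection


# Made-up example data: one building footprint.
footprints = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": None,
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]],
            },
        }
    ],
}

prepared = add_geojson_property(footprints)
```

Because the whole geometry travels with every feature, a selection still yields the complete shape even when the feature is split across several vector tiles.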
In addition to making overlay features selectable, you can define styles for their hover and click states.
To do so, each feature in your overlay needs a unique _featureid. If your overlay is served from PostGIS, you can define this property in the layer config’s queries array like so:
"queries":["select gid as __id__, gid as _featureid, site_name, feature_info_content, st_asgeojson(geom) as geojson, st_transform(geom, 900913) as __geometry__ from example_layer"]
Next you will need to ensure your source-layer is properly defined. In the source layer the source-layer property must match the id property and cannot contain spaces or periods. This layer will be hidden when the hover or click layer is revealed, so this should be a fill layer if your click or hover layers contain a fill.
Define the hover and click layers. Each must have a _featureid filter, and their ids must be suffixed with either -click or -hover. For example:
If you are loading your layers from a package, each layer must have an accompanying meta.json file with a name defined. This will ensure that the source-layer property is saved to the layer as you intend. If you do not have a meta.json file, the source-layer name will be the map layer’s file name, and will probably not work properly. See the example package for an example:
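The naming rules above can be sketched in Python as a small generator for the hover and click variants of a base layer. This is an illustration only: the state_layers and valid_source_layer helpers are hypothetical, the layer name and colors are made up, and the exact form of the _featureid filter is an assumption.

```python
import copy


def valid_source_layer(layer):
    """The source-layer must equal the layer id and contain no spaces or periods."""
    name = layer.get("source-layer", "")
    return name == layer.get("id") and " " not in name and "." not in name


def state_layers(base_layer, paint_by_state):
    """Build the "-hover" and "-click" variants of a base layer definition."""
    variants = []
    for state in ("hover", "click"):
        variant = copy.deepcopy(base_layer)
        variant["id"] = f"{base_layer['id']}-{state}"
        variant["paint"] = paint_by_state[state]
        # Placeholder _featureid filter; the exact expression is an assumption.
        variant["filter"] = ["==", "_featureid", ""]
        variants.append(variant)
    return variants


base = {
    "id": "example_layer",
    "source": "example_layer",
    "source-layer": "example_layer",
    "type": "fill",
    "paint": {"fill-color": "#888888"},
}
variants = state_layers(
    base,
    {"hover": {"fill-color": "#00aaff"}, "click": {"fill-color": "#ff8800"}},
)
```

Generating the variants from the base layer keeps the source, source-layer, and type in step, which matters because the base layer is hidden while a hover or click layer is shown.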
You can display custom HTML in the search map popup when a user hovers or clicks on a feature in a vector layer.
First, the data source for the layer may be geojson or vector tiles. This could be a tile server layer serving vector features from PostGIS, for example.
Add a property to your vector features called “feature_info_content”.
Populate this property with either an html element or a url from which to load html. If you use a url, you will need to update the ‘ALLOWED_POPUP_HOSTS’ to include the host from which you want to request HTML.
Add the overlay as you would any tileserver layer (see above).
You will now be able to add this layer to the map and see the markup defined in ‘feature_info_content’ in the search map popup.
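Before loading a layer, it can help to check that every URL used in feature_info_content points at a host listed in ALLOWED_POPUP_HOSTS. The sketch below assumes the setting is a list of bare host names and that matching is done on the host alone; both are assumptions, and the popup_host_allowed helper and the host names are hypothetical.

```python
from urllib.parse import urlparse

# In a real project this list lives in settings.py; the host is a placeholder.
ALLOWED_POPUP_HOSTS = ["popups.example.org"]


def popup_host_allowed(feature_info_content, allowed_hosts=ALLOWED_POPUP_HOSTS):
    """Return True for inline HTML, or for a URL whose host is in allowed_hosts."""
    parsed = urlparse(feature_info_content)
    if parsed.scheme not in ("http", "https"):
        return True  # treated as inline HTML, so there is nothing to fetch
    return parsed.hostname in allowed_hosts


inline_ok = popup_host_allowed("<p>Site <b>A</b></p>")
listed_ok = popup_host_allowed("https://popups.example.org/site-a.html")
unlisted_ok = popup_host_allowed("https://other.example.com/site-a.html")
```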
The HTML export templates are used to allow search results in Arches to be exported as a set of html files containing formatted report styled documents.
They are designed to be read offline by embedding a small CSS framework called Milligram, along with some additional elements.
The templates must be written by a developer and added to the Arches project template folder.
There can only be one HTML export template for each model in the application.
When the export process loads the template ready for it to be rendered, it will be passed a resources context object that contains the data for the resources to be written. This list is then iterated to build each record.
The structure of each resource object is as below:
{
    "displaydescription": " Excavation by Department Of Greater London Archaeology, April to May 1990, found a large 'soft spot' which was either a quarry ditch or was dug for dumping waste. Also uncovered features of 20th century date relating to a building called Green Acres.",
    "displayname": "Open Area Excavation at Lichfield Gardens",
    "graph_id": "b9e0701e-5463-11e9-b5f5-000d3ab1e588",
    "legacyid": "06eb7a47-baf7-4c79-aeab-2ffabe2502ea",
    "map_popup": "<Activity Descriptions>",
    "resource": {
        "...": "..."
    },
    "resourceinstanceid": "06eb7a47-baf7-4c79-aeab-2ffabe2502ea"
}
Some id and display information is directly accessible from the object, such as the displayname and displaydescription. The rest of the resource data is contained in the resource dictionary as a “disambiguated” version of the resource.
The dict uses the branch/card/node names as the keys, with the @display_value key containing the presentation value for that node. There are other values included in the dicts that can be used to add richer functionality if needed.
Note
This structure closely matches the JSON produced when requesting a resource with the format=json&v=beta query parameters.
Ensure that you use the v=beta parameter, because the export relies on this version of the data formatting.
"resource": {
    "Descriptions": [
        {
            "Description": {
                "@display_value": "Amendment date:none"
            },
            "Description Language": {
                "Description Language Metatype": null
            },
            "Description Type": {
                "@display_value": "Notes",
                "Description Metatype": null,
                "concept_id": "f1cbae8f-0090-47dc-8252-ee533a2deb29",
                "language_id": "en",
                "value": "Notes",
                "valueid": "daa4cddc-8636-4842-b836-eb2e10aabe18",
                "valuetype_id": "prefLabel"
            }
        }
    ],
    "Designation and Protection Assignment": [],
    "Heritage Area Names": [],
    "Location Data": {},
    "System Reference Numbers": {}
}
The resources context data sent to the templates is incompatible with the standard Django dot notation for accessing values. Usually you would access a dictionary value using dict.key, but the resource dictionary uses keys that contain spaces, and these cannot be parsed (for example, resource_data.Activity Names is not valid template syntax).
The other difficulty is that the resource dictionary may not contain the key you are looking for if no tiles exist for that data. Therefore, you need to check for the existence of the key before you access it.
To solve this, two new template filters were added to arches/arches/templatetags/template_tags.py:
has_key
You can use has_key as part of an if tag to check if there is a key in the object. If you try to access the object without checking then it may error should the key not be present.
{% if asset_names|has_key:"Asset Name Use Type" %}
{# you can access without error asset_names["Asset Name Use Type"] #}
{% endif %}
val_from_key
This function allows you to retrieve a value from a key that is not Django templating compliant. These can be chained to access nested dictionaries (make sure that the nested dictionary exists first).
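To make the behavior of the two filters concrete, here is a plain-Python sketch of what they do. It is not the Arches source: the real filters are registered with Django's template library in template_tags.py and may treat missing keys differently. The function bodies below are assumptions that only mirror the usage shown in this section.

```python
def has_key(obj, key):
    """True when obj is a dict that contains key."""
    return isinstance(obj, dict) and key in obj


def val_from_key(obj, key):
    """Return obj[key], or None when the key (or the dict itself) is missing."""
    if isinstance(obj, dict):
        return obj.get(key)
    return None


# A trimmed version of the "Descriptions" entry from the sample resource.
description = {
    "Description": {"@display_value": "Amendment date:none"},
    "Description Type": {"@display_value": "Notes"},
}

# Equivalent to: {{ desc|val_from_key:"Description Type"|val_from_key:"@display_value" }}
type_label = val_from_key(val_from_key(description, "Description Type"), "@display_value")

# Chaining through a key that does not exist yields None instead of an error here.
missing = val_from_key(val_from_key(description, "Description Language"), "@display_value")
```

In a template, wrapping the chained lookup in an if tag that uses has_key keeps the output clean when a resource has no tiles for that branch.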
The basic template below will provide the CSS framework, add the custom template tags (used to add custom functions to the template engine), and the initial resource loop within which to start your document.
Below is an example that includes sections that build tables and group elements together.
Note
Use <div class="keeptogether"></div> blocks around tables and other iterated sections to force the styling to keep things on the same page where possible when printing.
{% load template_tags %}
<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Report</title>
<style>
/* Milligram css */
*,*:after,*:before{box-sizing:inherit}html{box-sizing:border-box;font-size:62.5%}body{color:#606c76;font-family:'Roboto','Helvetica Neue','Helvetica','Arial',sans-serif;font-size:1.6em;font-weight:300;letter-spacing:.01em;line-height:1.6}blockquote{border-left:0.3rem solid #d1d1d1;margin-left:0;margin-right:0;padding:1rem 1.5rem}blockquote *:last-child{margin-bottom:0}
.button,button,input[type='button'],input[type='reset'],input[type='submit']{background-color:#9b4dca;border:0.1rem solid #9b4dca;border-radius:.4rem;color:#fff;cursor:pointer;display:inline-block;font-size:1.1rem;font-weight:700;height:3.8rem;letter-spacing:.1rem;line-height:3.8rem;padding:0 3.0rem;text-align:center;text-decoration:none;text-transform:uppercase;white-space:nowrap}
.button:focus,.button:hover,button:focus,button:hover,input[type='button']:focus,input[type='button']:hover,input[type='reset']:focus,input[type='reset']:hover,input[type='submit']:focus,input[type='submit']:hover{background-color:#606c76;border-color:#606c76;color:#fff;outline:0}
.button[disabled],button[disabled],input[type='button'][disabled],input[type='reset'][disabled],input[type='submit'][disabled]{cursor:default;opacity:.5}
.button[disabled]:focus,.button[disabled]:hover,button[disabled]:focus,button[disabled]:hover,input[type='button'][disabled]:focus,input[type='button'][disabled]:hover,input[type='reset'][disabled]:focus,input[type='reset'][disabled]:hover,input[type='submit'][disabled]:focus,input[type='submit'][disabled]:hover{background-color:#9b4dca;border-color:#9b4dca}
.button.button-outline,button.button-outline,input[type='button'].button-outline,input[type='reset'].button-outline,input[type='submit'].button-outline{background-color:transparent;color:#9b4dca}
.button.button-outline:focus,.button.button-outline:hover,button.button-outline:focus,button.button-outline:hover,input[type='button'].button-outline:focus,input[type='button'].button-outline:hover,input[type='reset'].button-outline:focus,input[type='reset'].button-outline:hover,input[type='submit'].button-outline:focus,input[type='submit'].button-outline:hover{background-color:transparent;border-color:#606c76;color:#606c76}
.button.button-outline[disabled]:focus,.button.button-outline[disabled]:hover,button.button-outline[disabled]:focus,button.button-outline[disabled]:hover,input[type='button'].button-outline[disabled]:focus,input[type='button'].button-outline[disabled]:hover,input[type='reset'].button-outline[disabled]:focus,input[type='reset'].button-outline[disabled]:hover,input[type='submit'].button-outline[disabled]:focus,input[type='submit'].button-outline[disabled]:hover{border-color:inherit;color:#9b4dca}
.button.button-clear,button.button-clear,input[type='button'].button-clear,input[type='reset'].button-clear,input[type='submit'].button-clear{background-color:transparent;border-color:transparent;color:#9b4dca}
.button.button-clear:focus,.button.button-clear:hover,button.button-clear:focus,button.button-clear:hover,input[type='button'].button-clear:focus,input[type='button'].button-clear:hover,input[type='reset'].button-clear:focus,input[type='reset'].button-clear:hover,input[type='submit'].button-clear:focus,input[type='submit'].button-clear:hover{background-color:transparent;border-color:transparent;color:#606c76}
.button.button-clear[disabled]:focus,.button.button-clear[disabled]:hover,button.button-clear[disabled]:focus,button.button-clear[disabled]:hover,input[type='button'].button-clear[disabled]:focus,input[type='button'].button-clear[disabled]:hover,input[type='reset'].button-clear[disabled]:focus,input[type='reset'].button-clear[disabled]:hover,input[type='submit'].button-clear[disabled]:focus,input[type='submit'].button-clear[disabled]:hover{color:#9b4dca}
code{background:#f4f5f6;border-radius:.4rem;font-size:86%;margin:0 .2rem;padding:.2rem .5rem;white-space:nowrap}pre{background:#f4f5f6;border-left:0.3rem solid #9b4dca;overflow-y:hidden}pre>code{border-radius:0;display:block;padding:1rem 1.5rem;white-space:pre}hr{border:0;border-top:0.1rem solid #f4f5f6;margin:3.0rem 0}
input[type='color'],input[type='date'],input[type='datetime'],input[type='datetime-local'],input[type='email'],input[type='month'],input[type='number'],input[type='password'],input[type='search'],input[type='tel'],input[type='text'],input[type='url'],input[type='week'],input:not([type]),textarea,select{-webkit-appearance:none;background-color:transparent;border:0.1rem solid #d1d1d1;border-radius:.4rem;box-shadow:none;box-sizing:inherit;height:3.8rem;padding:.6rem 1.0rem .7rem;width:100%}
input[type='color']:focus,input[type='date']:focus,input[type='datetime']:focus,input[type='datetime-local']:focus,input[type='email']:focus,input[type='month']:focus,input[type='number']:focus,input[type='password']:focus,input[type='search']:focus,input[type='tel']:focus,input[type='text']:focus,input[type='url']:focus,input[type='week']:focus,input:not([type]):focus,textarea:focus,select:focus{border-color:#9b4dca;outline:0}
select{background:url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 30 8" width="30"><path fill="%23d1d1d1" d="M0,0l6,8l6-8"></path></svg>') center right no-repeat;padding-right:3.0rem}select:focus{background-image:url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 30 8" width="30"><path fill="%239b4dca" d="M0,0l6,8l6-8"></path></svg>')}select[multiple]{background:none;height:auto}textarea{min-height:6.5rem}label,legend{display:block;font-size:1.6rem;font-weight:700;margin-bottom:.5rem}fieldset{border-width:0;padding:0}input[type='checkbox'],input[type='radio']{display:inline}.label-inline{display:inline-block;font-weight:normal;margin-left:.5rem}
.container{margin:0 auto;max-width:112.0rem;padding:0 2.0rem;position:relative;width:100%}.row{display:flex;flex-direction:column;padding:0;width:100%}.row.row-no-padding{padding:0}.row.row-no-padding>.column{padding:0}.row.row-wrap{flex-wrap:wrap}.row.row-top{align-items:flex-start}.row.row-bottom{align-items:flex-end}.row.row-center{align-items:center}.row.row-stretch{align-items:stretch}.row.row-baseline{align-items:baseline}
.row .column{display:block;flex:1 1 auto;margin-left:0;max-width:100%;width:100%}.row .column.column-offset-10{margin-left:10%}.row .column.column-offset-20{margin-left:20%}.row .column.column-offset-25{margin-left:25%}.row .column.column-offset-33,.row .column.column-offset-34{margin-left:33.3333%}.row .column.column-offset-40{margin-left:40%}.row .column.column-offset-50{margin-left:50%}.row .column.column-offset-60{margin-left:60%}.row .column.column-offset-66,.row .column.column-offset-67{margin-left:66.6666%}.row .column.column-offset-75{margin-left:75%}.row .column.column-offset-80{margin-left:80%}.row .column.column-offset-90{margin-left:90%}
.row .column.column-10{flex:0 0 10%;max-width:10%}.row .column.column-20{flex:0 0 20%;max-width:20%}.row .column.column-25{flex:0 0 25%;max-width:25%}.row .column.column-33,.row .column.column-34{flex:0 0 33.3333%;max-width:33.3333%}.row .column.column-40{flex:0 0 40%;max-width:40%}.row .column.column-50{flex:0 0 50%;max-width:50%}.row .column.column-60{flex:0 0 60%;max-width:60%}.row .column.column-66,.row .column.column-67{flex:0 0 66.6666%;max-width:66.6666%}.row .column.column-75{flex:0 0 75%;max-width:75%}.row .column.column-80{flex:0 0 80%;max-width:80%}.row .column.column-90{flex:0 0 90%;max-width:90%}
.row .column.column-top{align-self:flex-start}.row .column.column-bottom{align-self:flex-end}.row .column.column-center{align-self:center}@media (min-width:40rem){.row{flex-direction:row;margin-left:-1.0rem;width:calc(100% + 2.0rem)}.row .column{margin-bottom:inherit;padding:0 1.0rem}}
a{color:#9b4dca;text-decoration:none}a:focus,a:hover{color:#606c76}dl,ol,ul{list-style:none;margin-top:0;padding-left:0}dl dl,dl ol,dl ul,ol dl,ol ol,ol ul,ul dl,ul ol,ul ul{font-size:90%;margin:1.5rem 0 1.5rem 3.0rem}ol{list-style:decimal inside}ul{list-style:circle inside}.button,button,dd,dt,li{margin-bottom:1.0rem}fieldset,input,select,textarea{margin-bottom:1.5rem}blockquote,dl,figure,form,ol,p,pre,table,ul{margin-bottom:2.5rem}
table{border-spacing:0;display:block;overflow-x:auto;text-align:left;width:100%}td,th{border-bottom:0.1rem solid #e1e1e1;padding:1.2rem 1.5rem}td:first-child,th:first-child{padding-left:0}td:last-child,th:last-child{padding-right:0}@media (min-width:40rem){table{display:table;overflow-x:initial}}
b,strong{font-weight:bold}p{margin-top:0}h1,h2,h3,h4,h5,h6{font-weight:300;letter-spacing:-.1rem;margin-bottom:2.0rem;margin-top:0}h1{font-size:4.6rem;line-height:1.2}h2{font-size:3.6rem;line-height:1.25}h3{font-size:2.8rem;line-height:1.3}h4{font-size:2.2rem;letter-spacing:-.08rem;line-height:1.35}h5{font-size:1.8rem;letter-spacing:-.05rem;line-height:1.5}h6{font-size:1.6rem;letter-spacing:0;line-height:1.4}img{max-width:100%}.clearfix:after{clear:both;content:' ';display:table}.float-left{float:left}.float-right{float:right}
/* General */
html{font-size:55%}body{color:#000}.section-title blockquote{border:.3rem solid #d1d1d1;background-color:#90C0D8}h3{margin-bottom:.2rem}.container{margin:0;max-width:100%}hr{border-top:.3rem solid #d1d1d1;margin:2rem 0}ul{list-style:none}.location-details li{margin-bottom:0}@media print{.section-title:not(:first-child){page-break-before:always}.keeptogether{break-inside:avoid}}
/* Responsive tables */
.rtable{margin:0 0 40px 0;width:100%;box-shadow:0 1px 3px rgba(0,0,0,.2);display:table}@media screen and (max-width:580px){.rtable{display:block}}.rrow{display:table-row}.rrow:nth-of-type(odd){background-color:#fff}.rrow.rheader{font-weight:600;background:#d1d1d1}
@media screen and (max-width:1024px){.rrow{padding:14px 0 7px;display:block;border-bottom:1px solid #d1d1d1}.rrow.rheader{padding:0;height:6px}.rrow.rheader .rcell{display:none}.rrow .rcell{margin-bottom:10px;border:none}.rrow .rcell:before{margin-bottom:3px;content:attr(data-title);min-width:98px;font-size:.85em;line-height:10px;font-weight:700;text-transform:uppercase;display:block}}
.rcell{padding:6px 12px;display:table-cell;border-bottom:1px solid #d1d1d1}.rcell ul{list-style:none}.rcell li{margin-bottom:0}@media screen and (max-width:1024px){.rcell{padding:2px 16px;display:block}}
@media print{html{font-size:40%}.rtable{box-shadow:none;border:1px solid #d1d1d1}.rcell{border:1px solid #d1d1d1}.location-details .row{flex-direction:row}}
</style>
</head>
<body>
<header><h1>Heritage Assets</h1></header>
<main>
{% for resource in resources %}
{% with resource_data=resource.resource %}
<section class="section-title"><blockquote>
{% if resource_data|has_key:"Heritage Asset Names" %}
{% for n in resource_data|val_from_key:"Heritage Asset Names" %}
{% if n|has_key:"Asset Name Use Type" %}
{% if n|val_from_key:"Asset Name Use Type"|val_from_key:"@display_value" == "Primary" %}
<h2>{{ n|val_from_key:"Asset Name"|val_from_key:"@display_value" }}</h2>
{% endif %}
{% endif %}
{% endfor %}
{% endif%}
<p><strong>Primary Reference Number: </strong>{{ resource_data|val_from_key:"System Reference Numbers"|val_from_key:"PrimaryReferenceNumber"|val_from_key:"Primary Reference Number"|val_from_key:"@display_value" }}<br><strong>ResourceID: </strong>{{ resource.resourceinstanceid }}
</p></blockquote></section><section><div class="container location-details">
{% if resource_data|has_key:"Location Data" %}
<div class="row"><div class="column"><div><h3>OSGB Reference</h3><p>{% if resource_data|val_from_key:"Location Data"|has_key:"National Grid References" %}
{{ resource_data|val_from_key:"Location Data"|val_from_key:"National Grid References"|val_from_key:"National Grid Reference"|val_from_key:"@display_value" }}
{% endif %}
</p></div></div><div class="column"><div class="keeptogether"></div><div class="keeptogether"><div><h3>Named Location</h3><p>{% if resource_data|val_from_key:"Location Data"|has_key:"Addresses" %}
{% for address in resource_data|val_from_key:"Location Data"|val_from_key:"Addresses"|val_from_key:"@display_value" %}
{% if address|has_key:"Address Status" %}
{% if address|val_from_key:"Address Status"|val_from_key:"@display_value" == "Primary" %}
{{ address|val_from_key:"Full Address"|val_from_key:"@display_value" }}
{% endif %}
{% endif %}
{% endfor %}
{% endif %}
</p></div></div></div><div class="column"><div class="keeptogether"></div><div class="keeptogether"><div><h3>Localities/Administrative Areas</h3><p>{% if resource_data|val_from_key:"Location Data"|has_key:"Localities/Administrative Areas" %}
{% for area in resource_data|val_from_key:"Location Data"|val_from_key:"Localities/Administrative Areas" %}
<b>{{ area|val_from_key:"Area Type"|val_from_key:"@value" }}:</b> {{ area|val_from_key:"Area Names"|val_from_key:"Area Name"|val_from_key:"@display_value" }}
{% endfor %}
{% endif %}
</p></div></div></div></div>
{% endif %}
</div></section><hr><section><div class="container">
{% if resource_data|has_key:"Descriptions" %}
<div class="keeptogether">
{% for desc in resource_data|val_from_key:"Descriptions" %}
<h3>{{ desc|val_from_key:"Description Type"|val_from_key:"@display_value" }}</h3><p>{{ desc|val_from_key:"Description"|val_from_key:"@display_value" }}</p>
{% endfor %}
</div>
{% endif %}
{% if resource_data|has_key:"External Cross References" %}
<div class="keeptogether"><h3>External Cross References</h3><div class="rtable"><div class="rrow rheader"><div class="rcell">Number</div><div class="rcell">Description</div><div class="rcell">Source</div></div>
{% for src in resource_data|val_from_key:"External Cross References" %}
<div class="rrow"><div class="rcell" data-title="Number">{{ src|val_from_key:"External Cross Reference Number"|val_from_key:"@display_value" }}</div><div class="rcell" data-title="Description">
{% if src|has_key:"External Cross Reference Notes" %}
{{ src|val_from_key:"External Cross Reference Notes"|val_from_key:"External Cross Reference Description"|val_from_key:"@display_value" }}
{% endif %}
</div><div class="rcell" data-title="Source">{{ src|val_from_key:"External Cross Reference Source"|val_from_key:"@display_value" }}</div></div>
{% endfor %}
</div></div>
{% endif %}
</div></section>
{% endwith%}
{% endfor %}
</main></body></html>
As of version 7.5, Arches can be configured to meet WCAG defined AA level accessibility requirements for all public-facing user interfaces. Please review the documentation on activating the Arches Accessibility Mode.
Please continue reading below to understand how to better meet accessibility requirements for your customization of Arches.
It is important that Arches is developed with inclusivity in mind by making it accessible to users with disabilities.
In a number of regions, organisations are required to ensure that any software they use, or provide as a service, is accessible for users with disabilities. To this end, any UI development within Arches must take measures to conform to the guidance set out in the WCAG 2.1 requirements. This will allow Arches to be more easily adopted by such organisations and provide benefits to a wider audience.
The following information details the minimum steps required to adhere to WCAG accessibility guidelines. Although the remit has been to adhere to AA standards, wherever possible AAA has been reached for issues such as color contrast.
Although many files have been worked on for the many different requirements, there have been some frequently identified issues. Here are the commonly found problems:
Using a combination of the Wave browser extension and the Contrast Checker website mentioned above, you can identify which elements on a page need changing. For example, from the Arches v5 demo site, take the “Resource Type” button on the search page:
It has a background color #579DDB and a foreground color #FFFFFF - this fails the contrast test. You can use the contrast checker to test how things look when you lighten or darken either the background or foreground. In this instance, using the slider, let’s darken the background color to be #1E5A8F instead, which passes WCAG AAA.
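The same check can be scripted. The sketch below implements the contrast ratio formula from WCAG 2.x (relative luminance of each color, then (lighter + 0.05) / (darker + 0.05)) and applies it to the two background colors discussed above. For normal-size text, WCAG requires at least 4.5:1 for AA and 7:1 for AAA. The function names are our own.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to its linear-light value (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as "#RRGGBB"."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, always >= 1."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


original = contrast_ratio("#FFFFFF", "#579DDB")   # white text on the original button
darkened = contrast_ratio("#FFFFFF", "#1E5A8F")   # white text on the darkened button

print(f"original: {original:.2f}:1, passes AA: {original >= 4.5}")
print(f"darkened: {darkened:.2f}:1, passes AAA: {darkened >= 7.0}")
```

A script like this is handy for checking a whole palette at once, while the interactive contrast checker remains the easier tool for exploring alternatives.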
Sometimes it may suit design purposes not to have a label and make use of placeholder text. This is fine, but be mindful that users using screen readers will not get placeholder text read out to them. So we can make use of the aria-label attribute:
Also, you can use the aria-label attribute on a container element to describe the content within:
<div class="container" aria-label="Search buttons to filter the search results"><button id="filterBtn">Filters</button><button id="typeBtn">Type</button></div>
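Make sure that all headings are ordered and nested correctly. There should only be one <h1> tag per page, and be sure not to skip any heading levels. The correct order should be something like this: an <h1> page title, <h2> section headings beneath it, and <h3> subsections nested within each <h2>.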
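Heading order can also be checked automatically. The sketch below uses only the Python standard library to collect the heading levels from a page and report a missing or repeated <h1> and any skipped levels. It is a rough aid rather than a replacement for a full accessibility audit, and the class and function names are our own.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Records the level of every <h1>..<h6> start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of human-readable problems with the heading structure."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    if collector.levels.count(1) != 1:
        problems.append("expected exactly one <h1>")
    previous = 0
    for level in collector.levels:
        # Moving more than one level deeper in a single step skips a level.
        if level > previous + 1:
            if previous:
                problems.append(f"<h{level}> follows <h{previous}>")
            else:
                problems.append(f"page starts at <h{level}>")
        previous = level
    return problems


good = heading_problems("<h1>Search</h1><h2>Filters</h2><h3>Type</h3><h2>Results</h2>")
bad = heading_problems("<h1>Search</h1><h3>Type</h3>")
```

Moving back up the hierarchy (from <h3> to <h2>) is fine; only jumps that go deeper by more than one level are reported.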
UI development must ensure the website/page is still navigable and actionable via the keyboard. There may be instances where click events are required on elements other than href links, for example (using Knockout binding):
<div class="css-class" data-bind="click: function() {myFunc();}">
Some content
</div>
This will listen for a mouse click on the div element, but this won’t work if a user is using their keyboard to navigate and operate the website. A keyboard user will not be able to tab to this element or be able to action it by pressing their space bar or enter key. To facilitate this, we need to make it tabbable and actionable via a keypress as follows:
<div class="css-class" tabindex="0" data-bind="click: function() {myFunc();}" onkeypress="$(this).trigger('click');">
Some content
</div>
Note the use of tabindex="0" which includes the element within the natural DOM tab order and the onkeypress which in this example uses jQuery to force a click. There may be several ways to achieve this but always ensure any clickable element can also be actioned using a keyboard, usually the enter key once tabbed to.
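One way to find elements that need this treatment is to scan templates for click handlers on elements that a keyboard user cannot reach. The sketch below is a deliberately simple heuristic, assuming Knockout click bindings or onclick attributes mark clickable elements; the list of natively focusable tags is a simplification (an <a> without an href is not focusable, for instance), and the class name is our own.

```python
from html.parser import HTMLParser

# Simplified list of tags that are keyboard-focusable without a tabindex.
NATIVELY_FOCUSABLE = {"a", "button", "input", "select", "textarea"}


class ClickTargetAudit(HTMLParser):
    """Collects (tag, line) for clickable elements that cannot be tabbed to."""

    def __init__(self):
        super().__init__()
        self.unreachable = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        binding = attributes.get("data-bind") or ""
        has_click = "click:" in binding or "onclick" in attributes
        if has_click and tag not in NATIVELY_FOCUSABLE and "tabindex" not in attributes:
            self.unreachable.append((tag, self.getpos()[0]))


def audit(html):
    parser = ClickTargetAudit()
    parser.feed(html)
    return parser.unreachable


before = audit('<div class="css-class" data-bind="click: function() {myFunc();}">Some content</div>')
after = audit('<div class="css-class" tabindex="0" data-bind="click: function() {myFunc();}">Some content</div>')
```

The audit only confirms that an element is reachable; whether it can actually be activated with the enter key or space bar still needs to be tested by hand.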
When designing websites, we must think about all users, not only, for example, desktop or laptop users with large screens. Users with a visual impairment may increase the font size or spacing, or the screen resolution may be lower.
In a responsive application, the layout adapts correctly when users make these adjustments. A responsive application is also usable on tablets and mobile devices; in some regions, a mobile phone is people’s only computing device.
The website should offer the same functionality whether viewed on a large monitor, a mobile screen, or anything in between, so that we can be as inclusive as possible. If certain information cannot be viewed on smaller screens, then a suitable alternative should be presented to the user.
Arches uses the Bootstrap front-end framework, which enables the content to be rendered in a grid system that can be adapted to suit varying screen sizes and types, including mobiles and tablets. No content should appear ‘cut-off’ when reducing the screen width; it should either stack, wrap or be presented differently.
This can easily be tested in a browser such as Chrome or Firefox, both of which have built-in developer tools for viewing pages at different device sizes or screen widths. The ultimate test is to use an actual device to see what happens in the real world. For this level of testing, a service such as BrowserStack, which provides access to many different physical devices and browsers, is recommended.
It’s also good practice to ensure that web pages operate the same using different web browsers. For example, some things may not work correctly in Safari or Chrome, but everything seems fine in Firefox.
Any rendered html needs to pass W3C HTML Validator tests. With any dynamically produced web page, it’s easy to load the page in a browser and view the source, copy and paste into the ‘Validate by direct input’ form field, run the test and work on any errors as necessary.
Here are some common issues found:
Empty id and class attributes, like id="" and class="" - if they’re empty remove them
Incorrect html markup, like having a div tag inside a span tag
Incorrect html5 semantic markup - for example no landmarks, no header, no main, no footer etc
On some pages, the first code on a page contains the open source copyright comment, which is acceptable and required by the GNU Affero General Public License, but sometimes the comment is duplicated causing a validation error
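Two of the issues above (empty id or class attributes, and a div nested inside a span) are easy to catch before submitting a page to the validator. The sketch below uses only Python's standard library; the name precheck and the message wording are invented for illustration, and it is not a substitute for the W3C validator.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MarkupPrecheck(HTMLParser):
    """Flag empty id/class attributes and <div> elements inside <span>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.issues = []

    def handle_starttag(self, tag, attrs):
        line = self.getpos()[0]
        for name, value in attrs:
            if name in ("id", "class") and not (value or "").strip():
                self.issues.append(f"line {line}: empty {name} attribute on <{tag}>")
        if tag == "div" and "span" in self.open_tags:
            self.issues.append(f"line {line}: <div> nested inside <span>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def precheck(html):
    """Return a list of human-readable issues found in the markup."""
    parser = MarkupPrecheck()
    parser.feed(html)
    parser.close()
    return parser.issues
```

The page source copied from the browser can be passed to precheck as a string, and any reported lines fixed before running the full validation.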
Always be mindful of users who need to use screen readers, and check how sections of the page are read out and in what order.
For desktop checks, use the NVDA application to identify possible changes or where to include some aria-label descriptive text to assist with the content visualisation.
Mobile devices have some built-in screen reader technology. For iOS it’s called VoiceOver and can be accessed under Settings>Accessibility. For Android devices it’s called Screen Reader (TalkBack on many devices) and can be accessed via Settings>Accessibility>Screenreader.
For example, when viewing a web page, one of the first things read out may be the menu. If the menu has many items, this could become a tedious activity, so it’s good practice to include a “Skip to main content” link that appears when a user first presses the tab button. Pressing enter should change focus to the start of the main content, bypassing the menu items.
Alternative solutions where components cannot be made accessible#
In the event that a specific component cannot be made fully accessible, an alternative method of achieving the same outcome should be provided.
For example, if using an SVG canvas type library to display information or provide a search function, a tabular alternative could also be created that provides the same function.
Ideally, the accessible solution would be the primary solution.
There are many more WCAG guidelines that need to be adhered to but these mentioned here are among the most common. It’s always good practice to have these points in mind whenever creating web pages/content. Always keep in mind how a keyboard-only user would be able to interact with pages and how they would still work on smaller devices such as tablets or mobiles.
Even though your targeted users may not be using mobile devices, you have to cater for every need. In this day and age, the “Mobile first” principle should be used and play a significant role in any product design/development work.
If you want to support localization in your Arches instance, you’ll first need to do the following:
Update your settings.py file by adding this import statement at the top:
from django.utils.translation import gettext_lazy as _
Next copy the MIDDLEWARE setting to your project’s settings.py file. If it’s already in your settings.py file, be sure to uncomment `"django.middleware.locale.LocaleMiddleware"`
Next add the LANGUAGE_CODE, LANGUAGES, and SHOW_LANGUAGE_SWITCH settings to your project’s settings.py file and update them to reflect your project’s requirements:
# default language of the application
# language code needs to be all lower case with the form:
# {langcode}-{regioncode} eg: en, en-gb ....
# a list of language codes can be found here http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGE_CODE = "en"

# list of languages to display in the language switcher,
# if left empty or with a single entry then the switch won't be displayed
# language codes need to be all lower case with the form:
# {langcode}-{regioncode} eg: en, en-gb ....
# a list of language codes can be found here http://www.i18nguy.com/unicode/language-identifiers.html
LANGUAGES = [
    ('de', _('German')),
    ('en', _('English')),
    ('en-gb', _('British English')),
    ('es', _('Spanish')),
]

# override this to permanently display/hide the language switcher
SHOW_LANGUAGE_SWITCH = len(LANGUAGES) > 1
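Because the comments require lower-case codes of the form {langcode}-{regioncode}, it can help to check the values before starting the server. The following is a minimal sketch, not part of Arches: the function name, the regular expression, and the extra check that the default language appears in LANGUAGES are all assumptions made for illustration.

```python
import re

# Lower-case {langcode} or {langcode}-{regioncode}, e.g. "en" or "en-gb".
# The exact pattern is an assumption; adjust it to the codes your project uses.
LANGUAGE_CODE_PATTERN = re.compile(r"^[a-z]{2,3}(-[a-z0-9]{2,8})?$")


def check_language_settings(language_code, languages):
    """Return a list of problems found in the two settings (empty if none)."""
    problems = []
    codes = [code for code, _name in languages]
    # dict.fromkeys removes duplicates while keeping the original order.
    for code in dict.fromkeys(codes + [language_code]):
        if not LANGUAGE_CODE_PATTERN.match(code):
            problems.append(f"{code!r} is not a lower-case language code")
    if codes and language_code not in codes:
        problems.append(f"default {language_code!r} is not listed in LANGUAGES")
    return problems


# Plain strings stand in for the translated names used in settings.py.
LANGUAGES = [("de", "German"), ("en", "English"),
             ("en-gb", "British English"), ("es", "Spanish")]
print(check_language_settings("en", LANGUAGES))
```

An empty list means both settings follow the form described in the comments; each string in a non-empty list names one value to correct.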
Now add this import statement to the top of your urls.py file:
from django.conf.urls.i18n import i18n_patterns
Finally add the following code to the end of your urls.py file:
Once the system is prepared for localization, the next steps involve generating a Django message file or .po file which will contain all available translation strings in Arches and how they should be translated in any given language.
There are some example commands to make and load PO files in the core arches settings file that can be found here. If loading a new PO file, simply replace the existing po file and run compilemessages.
By default, every language from the LANGUAGES array in settings.py is available for business data entry.
To add additional languages for business data entry only, you can do the following.
Access the admin page (http://localhost:8000/admin/)
Choose the “Languages” table. (http://localhost:8000/models/language)
Select “Add Language”
Fill in information on new language, including a default direction.
Repeat this process for all new languages you wish to add.
Additionally, remove any languages you do not plan on using.
Once this is complete, text widgets should be able to write data in the desired languages.
Business data can be exported in RDF format. The directionality of the string data will be lost as
the RDF specification does not include directionality. There is an
active attempt to include direction within the
RDF specification.
It is possible to import and export localized business data through CSV format. There is a --language
switch that will limit the languages that will be exported (all languages are exported by default).
However, if attempting to re-import a limited subset of languages through the csv importer, entire
string objects will be overwritten by the subset. For example, if a string node has values for
English, Spanish, and French, the subset of languages can be limited by specifying
--languages en,es
If attempting to import the resulting csv, any values that were pre-existing for French would be
overwritten in “overwrite” mode or added as a separate tile in “append” mode. There is currently
no way to merge these values. If the intention is to re-import the csv values later, export all
languages.
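The effect of the two import modes on a localized string can be pictured with a toy model. This is not the importer's code: the function name, the tile representation (a dict of language code to text), and the sample values are hypothetical and exist only to show why the French value does not survive a round trip.

```python
def reimport_string_node(existing, imported, mode):
    """Model the two import modes for one localized string node.

    existing: list of tiles, each a dict mapping language code -> text
    imported: dict mapping language code -> text, as read from the CSV
    """
    if mode == "overwrite":
        # The whole string object is replaced; languages missing from the CSV are lost.
        return [dict(imported)]
    if mode == "append":
        # The imported values become a separate tile; nothing is merged.
        return existing + [dict(imported)]
    raise ValueError(f"unknown mode: {mode}")


existing = [{"en": "Bridge", "es": "Puente", "fr": "Pont"}]
subset = {"en": "Bridge", "es": "Puente"}  # exported with only en and es

print(reimport_string_node(existing, subset, "overwrite"))
print(reimport_string_node(existing, subset, "append"))
```

In neither mode does the result contain a single tile holding all three languages, which is why exporting every language is the safer choice when a re-import is planned.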
Arches is configured to use Cantaloupe if you want to host images made available via the IIIF presentation API. Below is a simplified setup guide. The full Cantaloupe setup documentation is here
Download and extract/unzip the cantaloupe source code from among these releases. We recommend the latest release of version 4.
In a directory containing all the contents of the downloaded source code, make a copy of cantaloupe.properties.sample and name it cantaloupe.properties. When hosting images locally (relative to your arches project), change the value of the FilesystemSource.BasicLookupStrategy.path_prefix argument to the absolute path of wherever your uploaded files are located, for example /home/ubuntu/project/project/uploadedfiles/.
“Lookup Strategy” should already be set to “BasicLookupStrategy”.
Note
Other strategies (such as delegation) can be configured depending on your desired implementation.
Ensure that the argument CANTALOUPE_DIR in your project’s settings.py file is os.path.join(APP_ROOT, "uploadedfiles") if your project’s uploadedfiles directory is where images will be stored, otherwise point to the appropriate location.
Run the Cantaloupe server (either using the java command or some service or process manager; see the “Running” section of Cantaloupe docs)
Note
Remote hosting of Cantaloupe server, the manifest.json files, and image files are all still in development.
The IIIF Manifests each represent a collection of at least one image (called a “canvas”). It is called an Image “Service” because the cantaloupe server enables the user to zoom and dynamically view the image.
Navigate to the Image Service Manager in the Arches UI and select at least one image to create a new service. If you do not see the icon for Image Service Manager in the left-hand navbar, you may need to update the entry in the Plugins table of your database like so:
sudo -u postgres psql -d [test_project] -c "update plugins set config = '{\"show\":true}' where name = 'Image Service Manager';"
Now that an Image Service (referred to as a “Manifest”) exists, it will be available for any user to create Annotation data. You can edit this Image Service in the Image Service Manager to upload additional image files or add metadata.
When a resource is edited and a tile saved to that card on that model, if the file is an image type (i.e. a .tiff, .tif, .jpg, .jpeg, or .png) a record in the iiif_manifests table in the database will be created pointing to a manifest .json file that will render the image file from cantaloupe into the IIIF Viewer card (see below).
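When preparing files for upload, it can be useful to know in advance which ones fall into the image types listed above. The helper below is a hypothetical sketch, not Arches code; in particular, treating the extension case-insensitively is an assumption.

```python
from pathlib import PurePath

# Extensions listed in the paragraph above.
IMAGE_EXTENSIONS = {".tiff", ".tif", ".jpg", ".jpeg", ".png"}


def is_image_file(filename):
    """Return True if the file name ends in one of the listed image extensions."""
    # Lower-casing the suffix is an assumption about how extensions are compared.
    return PurePath(filename).suffix.lower() in IMAGE_EXTENSIONS


for name in ["survey/photo.jpeg", "scan.TIF", "report.pdf"]:
    print(name, is_image_file(name))
```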
To make use of IIIF imagery, a resource model must have a semantic node configured to use the “IIIF Card” selected for “Card Type”.
Inside this card/nodegroup, add a child node and select “annotation” datatype. To include other data along with this annotation, (e.g. text, date, or related resources) create sibling nodes of those datatypes, ensuring they are still the children of the semantic node designated with the “IIIF Card”.
When creating a tile for this card in the resource editor, the user will first be prompted to select a IIIF Manifest from a dropdown list. You should see any IIIF Manifests created from the above process.
Note
A single tile for a IIIF card could contain multiple features (point, line, polygon) as part of the annotation data, but commonly you would also want nodes of other datatypes (for ex: string) grouped into this IIIF card; thus to make multiple tiles with different values on the same resource instance, you need to check “Allow Multiple Values” on the IIIF card in the Card Manager.
A dropdown list provides users with options for selecting between IIIF manifests when they use the resource editor. A user can also add a new IIIF manifest that exists on a remote server by pasting the URL to that manifest (the URL will point directly to the remote server’s manifest JSON resource) into the input/search box of the dropdown list. See the animation below for an illustration:
One can use SQL to pre-populate the list of IIIF Manifests. The following SQL inserts will pre-populate the IIIF manifest dropdown list:
insert into iiif_manifests (label, url, description) values ('IIIF Manifest of Gospel Book', 'https://media.getty.edu/iiif/manifest/a628a212-a325-406c-aa4d-c43eeb393ec5', 'accession number: 83.MB.69, TMS ID: 1571, UUID: 8c6116d5-09f6-4416-8d15-1804c9337c65');
insert into iiif_manifests (label, url, description) values ('IIIF Manifest of Saint Matthew Seated', 'https://media.getty.edu/iiif/manifest/028b269e-054f-4d39-83b9-6b207707731d', 'accession number: 83.MB.69.9v, TMS ID: 3275, UUID: 4093369e-678b-41fc-a7e9-a5fef60c7385');
insert into iiif_manifests (label, url, description) values ('IIIF Manifest of The Transfiguration', 'https://media.getty.edu/iiif/manifest/a91a88a3-ca07-480f-b749-8e1c28d4f040', 'TMS ID: 3278, UUID: 601d907b-2941-4724-9f14-7b7d22f2be63');
Task management using Celery is available in Arches if you have a message broker like RabbitMQ or Redis. The Celery documentation
provides information on broker installation.
Once you have your broker available, you will need to configure your settings. You will find this in your project’s
settings.py file. Each setting that begins with the CELERY prefix will be used as a celery config, so you can configure
celery by adding the configs you need:
CELERY_BROKER_URL = 'amqp://guest:guest@localhost'
CELERY_RESULT_BACKEND = 'django-db'  # Use 'django-cache' if you want to use your cache as your backend
The settings you are likely to want to modify right away are the CELERY_BROKER_URL and the CELERY_RESULT_BACKEND.
Your CELERY_BROKER_URL should point to your broker’s service URL. If you are using RabbitMQ, you will probably want to
create a new user and password and replace ‘guest:guest’ with the new user’s credentials.
Your CELERY_RESULT_BACKEND is set to the Django ORM by default. If your task will be run frequently, you may want to
consider a more performant option. If you’ve configured a cache for django, you could use that, or you could use another
backend option that Celery supports.
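When replacing guest:guest with real credentials, characters such as "@" or "/" in a password can break the URL unless they are percent-encoded. The following sketch shows one way to build the value; the helper name and the example user and password are hypothetical, and the URL form simply mirrors the example above.

```python
from urllib.parse import quote


def amqp_broker_url(user, password, host="localhost"):
    """Build an amqp:// URL, percent-encoding the credentials."""
    # safe="" makes quote() escape "/" and "@" as well.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"amqp://{credentials}@{host}"


# Hypothetical RabbitMQ user created for this project.
CELERY_BROKER_URL = amqp_broker_url("arches_worker", "p@ss/word")
print(CELERY_BROKER_URL)
```

In a real deployment the password would normally be read from an environment variable or a local settings file rather than written into settings.py.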
To add additional tasks to your project, all you need to do is add a tasks.py file to your project. This could be placed
in your project’s root directory (next to manage.py) or any sub-directory. However, you will likely want to put it, at
least to start, in the directory below root (next to urls.py).
Here’s an example of a very simple task in tasks.py:
For your tasks to run, both your broker service and a Celery worker need to be running. For production you will likely
want to run a worker as a service. One way to do this is to use supervisord. For more information see: Setting up Supervisord for Celery
However for development, you probably want to run your worker in a terminal. To do so just cd into your project’s root
directory with your virtual environment activated and run:
python manage.py celery start
Users of Apple Silicon Macs may encounter billiard.exceptions.WorkerLostError following a warning
emitted by the OS explaining it is “crashing instead” rather than unsafely calling fork(). Until this
issue is resolved by celery, launching like so will
silence the errors:
Note that in the event of celery errors which do not clearly indicate what is breaking or preventing your tasks from succeeding, we recommend running the following command instead for the purpose of debugging and isolating any unhandled exceptions:
celery -A [app_name] worker --loglevel=warning
Once the worker is running you should be able to check your database and see your task result output with the following
query:
select * from django_celery_results_taskresult;
If you want to monitor your tasks with a realtime console, you can use Flower.
Two-factor authentication is an extra layer of security designed to ensure that you’re the only person who can access your Arches account, even if someone knows your password.
Two-factor Authentication is the technical term for the process of requiring a user to verify their identity in two unique ways before they are granted access to the system.
Users typically rely on authentication systems that require them to provide a unique identifier such as an email address or username and a correct password to gain access to the system.
Two-factor Authentication extends this by adding an additional step that requires the user to enter a one-time dynamically generated token that has been delivered through a secondary method that presumably only the user has access to.
This token is dynamically generated and lasts only a brief period of time before changing. It is based on an encrypted secret key that is stored in the application and the secondary system (e.g. a smartphone).
Two-factor Authentication gives the user and system administrator peace of mind that even if the user’s password is compromised,
the account cannot be accessed without also knowing the dynamically generated one-time password.
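To make the idea of a short-lived token derived from a shared secret concrete, here is a sketch of the standard time-based one-time password algorithm (TOTP, RFC 6238) that authenticator apps commonly implement. It assumes 30-second time steps, HMAC-SHA-1 and six digits; whether Arches uses exactly these parameters is an assumption, and this is not the Arches implementation.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at_time=None, step=30, digits=6):
    """Compute a time-based one-time password as specified in RFC 6238."""
    # Authenticator secrets are usually shared as base32 text (padded to 8 chars).
    key = base64.b32decode(secret_b32, casefold=True)
    now = time.time() if at_time is None else at_time
    # Both sides derive the same counter from the clock, so no message is exchanged.
    counter = int(now // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes at an offset taken from the digest.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Test secret from RFC 6238; never use it for a real account.
secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(secret))
```

Calling totp(secret) twice within the same 30-second window returns the same code, and the code changes once the window rolls over, which is the behaviour described above.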
There are two configurable settings, ENABLE_TWO_FACTOR_AUTHENTICATION and FORCE_TWO_FACTOR_AUTHENTICATION. Each accepts a value of True or False.
ENABLE_TWO_FACTOR_AUTHENTICATION - Allows users to enable two-factor authentication via their UserProfile, and redirects login of users that have enabled two-factor authentication to secondary credentials page.
FORCE_TWO_FACTOR_AUTHENTICATION - Must have ENABLE_TWO_FACTOR_AUTHENTICATION enabled. Forces all users to log in with two-factor authentication credentials.
Note
ENABLE_TWO_FACTOR_AUTHENTICATION and FORCE_TWO_FACTOR_AUTHENTICATION do not trigger any other actions, such as terminating user sessions.
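The relationship between the two settings can be summarised in a few lines. This helper is hypothetical and not part of Arches; raising an error for the invalid combination is a choice made for this sketch, not documented Arches behaviour.

```python
def two_factor_mode(enable, force):
    """Describe the effective login behaviour for the two settings."""
    if force and not enable:
        # FORCE_TWO_FACTOR_AUTHENTICATION must have ENABLE_TWO_FACTOR_AUTHENTICATION enabled.
        raise ValueError(
            "FORCE_TWO_FACTOR_AUTHENTICATION requires ENABLE_TWO_FACTOR_AUTHENTICATION = True"
        )
    if force:
        return "required for all users"
    if enable:
        return "optional, per user profile"
    return "disabled"


print(two_factor_mode(enable=True, force=False))
```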
Setting up Two-factor Authentication for User Accounts#
If ENABLE_TWO_FACTOR_AUTHENTICATION or FORCE_TWO_FACTOR_AUTHENTICATION has been enabled in your Arches application, users can check the status of their accounts in the User Profile page.
User Profile showing two-factor authentication status.#
From the User Profile Edit page, users can send an email to their registered email address containing instructions and a link to set up two-factor authentication.
User Profile showing two-factor authentication reset email interaction.#
Note
In order to continue, the User should already have access to a means of secondary authentication.
This is done with an external application, usually with Google Authenticator,
Authy, LastPass Authenticator, or any other authentication application.
Following the email link, the user will navigate to the two-factor authentication settings page.
From this page, Users can generate a QR code to be scanned with an external authentication application, or a secret key to be entered manually. This secret is used to generate time-based authentication tokens.
Two-factor authentication settings page showing QR code.#
Once the user has enabled two-factor authentication, or if FORCE_TWO_FACTOR_AUTHENTICATION has been enabled at the system level, the user will be presented with an additional step in the login process. Once the six-digit authentication code has been entered, the User will be logged in.
In the following guides, you’ll see mention of “v4”. However, all of these steps work for Arches v5, as well.
Upgrading your Arches installation is a complex process, as a significant backend redesign was implemented in v4. We have developed the following documentation (and the code to support it) to guide you through the process. You will be performing a combination of shell commands and basic file manipulation.
Before migrating data, you’ll need to install core Arches and create a new project. You can name your project whatever you want, but throughout this documentation we’ll refer to it as my_project. You can customize the templates and images in your project any time (before or after migrating the data). We recommend adding a Mapbox key right away so you can use the map for visual checks during the migration.
You must export all of your data from v3. Before you begin, however, you’ll need to install some enhanced commands into your v3 app. This is a simple process:
You will get a console update during the process, which could take a few minutes. The result will be one file:
v3resources-all-<date>.json
Place the file(s) somewhere easy to access.
Important
If you have a very large database (maybe 25k+ resources), we recommend using --format JSONL. This will create a JSON Lines file, which requires minimal memory resources. Exporting the entire database to a single JSON file can crash servers without enough memory. For even more control over the export, add --split to the command above. One JSON/JSONL file will be created per resource type. This is extremely helpful for debugging migration issues.
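The memory advantage of JSON Lines comes from being able to parse one resource at a time instead of loading the whole export. A minimal sketch of that reading pattern follows; the key name entitytypeid and the sample records are assumptions for illustration and may not match the layout of your export.

```python
import io
import json


def iter_jsonl(lines):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_by_type(lines, type_key="entitytypeid"):
    """Count resources per type while holding only one resource in memory."""
    counts = {}
    for resource in iter_jsonl(lines):
        key = resource.get(type_key, "unknown")
        counts[key] = counts.get(key, 0) + 1
    return counts


# An in-memory stand-in for an exported file; a real file object works the same way.
sample = io.StringIO(
    '{"entitytypeid": "ACTIVITY.E7"}\n'
    '\n'
    '{"entitytypeid": "ACTIVITY.E7"}\n'
    '{"entitytypeid": "ACTOR.E39"}\n'
)
summary = count_by_type(sample)
print(summary)
```

A quick count like this is also a convenient check that the export contains the number of resources you expect before starting the migration.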
Place the file somewhere easy to access. This is the “Arches” scheme from your RDM, which is, typically, where your entire concept set will exist. If you are using a different concept scheme, substitute its name for “Arches” in the command above.
Warning
You are only able to migrate one scheme. If your v3 dropdown lists are composed of concepts from two different schemes (i.e. you added another scheme alongside “Arches”, added concepts to it, and then added those concepts to dropdown lists) you’ll need to manually consolidate these schemes into one before exporting.
Dropdown Lists themselves are not migrated; they are recreated in v4 based on Top Concepts.
You must move all of the media files that have been uploaded to your v3 deployment to your v4 project.
By default, the directory in your new v4 project should be called my_project/my_project/uploadedfiles. If this directory doesn’t exist, create it, and move all of the v3 media into it.
AWS S3 and Azure Users
You should be able to continue using the same storage bucket, and just point your v4 project at it. Just make sure your content is in a folder called uploadedfiles. In theory this should work, but we haven’t tested it.
Now that you have exported all of the data you need from your v3 deployment, head back to Migrating Your Data.
After you have all the v3 data exported, you are ready to follow the appropriate workflow for your deployment.
If your v3 deployment of Arches was based on Arches-HIP, and you did not modify any of the graphs (beyond perhaps changing node names) you can use the Arches-HIP Workflow. If you have changed the RDM content that’s fine, it will be preserved through the migration.
If you want, you can rename the directory. For this tutorial, we will rename it
from arches-v4-hip-pkg-master to simply pkg. Really, you can name it whatever you want.
Now go into your project’s my_project/my_project/settings.py file and add this new line, which points
to this new package, somewhere after the APP_ROOT line:
You can actually place the package wherever you want, as long as PACKAGE_DIR
holds the path to it. You can even leave out this setting entirely if you pass --target path/to/package
to all of the v3 commands that are used later in this process.
Finally, load this package into your project:
python manage.py packages -o load_package -s pkg -db true
Important
We recommend using the -db true flag here, which will completely erase your v4 project database
and create a fresh installation. If you have already added a lot of new user logins to your v4 project,
these will be lost. If you have already added settings to your project like a MapBox API key, for example,
follow these steps to retain them before running the command with -db true:
In your v4 project, run python manage.py packages -o save_system_settings
Find the newly created file my_project/my_project/system_settings/System_Settings.json and move it into my_project/pkg/system_settings.
When you do run the load package command, say “y” to the prompt about overwriting project settings (they will be imported from this new settings file).
Before moving on you should be able to view your project in a browser, login with the default admin/admin credentials,
and go to the Arches Designer to confirm that you have all six Arches-HIP Resource Models loaded. There should be no
resources in your database yet.
Move v3resources-all-<date>.json from Export v3 Business Data into v3data/business_data. This file name could be slightly different for you (or you may have multiple files) based on how you ran the v3 export.
Now you are ready to convert and import your v3 data:
python manage.py v3 write-v4-json
This command will create new v4 resource JSON/JSONL files in pkg/business_data, one per Resource Model.
You’ll be provided with easy copy/paste commands to load the files if you want, or you can add -i/--import
to the command to load the resources immediately.
To help you debug any errors you encounter, and generally give you more control over this command, we’ve provided a
number of optional arguments.
-i, --import
Directly imports the resources after the v4 JSON/JSONL file is created.
-m, --resource-models
List the names of resource models to process, by default all are used.
-n, --number
Limits the number of resources to load: -n 10 will only load the first 10 resources of each resource model.
--exclude
List of resource ids (uuids) to exclude from the write process.
--only
Specify one or more resource ids to process. All other resources will be ignored.
--skipfilecheck
Skip the check for uploaded image files that are referenced in v3 business data. Only applicable if you are converting resources with images attached to them.
--verbose
Enables verbose printing during the process. Generally not recommended, it’s very verbose.
For example, running the command with -m "Activity" -n 100 --exclude 08b68d46-c202-458a-bf11-bc7a1dd5b2ef -i will only write the first 100 “Activity” resources to v4 JSON (even if there are more Resource Models in your package), excluding a single resource whose id is 08b68d46-c202-458a-bf11-bc7a1dd5b2ef, and will then immediately import these resources into your database.
Once you have all of your resources loaded in your database, you can import the resource relations from v3. Use:
python manage.py v3 write-v4-relations
to write the file, and add -i/--import to directly import them. You will likely get errors if you try to
import resource relations but have not loaded all of your business data.
You can now treat this package just as you would any other v4 package, by adding custom functions, map layers, etc.
You can also safely remove the v3data directory if you wish, as those files will no longer be used (generally it
is good to retain that sort of data somewhere though).
If you have a v3 deployment with custom resource graphs, you’ll need to use the following workflow. Be aware, you’ll need to remake your custom resource graphs in v4 (as “Resource Models”). This is listed as Step 6 below.
Experienced developers should be able to use some of these steps individually to accomplish discrete tasks, but we generally recommend following this workflow as a whole.
Note
All of the commands below must be run from within your v4 project.
You can actually name your new package whatever you want, and place it wherever you want, as long as PACKAGE_DIR holds the path to it. You can even omit PACKAGE_DIR entirely if you pass --target path/to/package to all of the v3 commands below.
Finally, load this package into your project:
python manage.py packages -o load_package -s pkg -db true
Important
We recommend using the -db true flag here, which will completely erase your v4 project database
and create a fresh installation. If you have already added a lot of new user logins to your v4 project,
these will be lost. If you have already added settings to your project like a MapBox API key, for example,
follow these steps to retain them before running the command with -db true:
In your v4 project, run python manage.py packages -o save_system_settings
Find the newly created file my_project/my_project/system_settings/System_Settings.json and move it into my_project/pkg/system_settings.
When you do run the load package command, say “y” to the prompt about overwriting project settings (they will be imported from this new settings file).
Before moving on you should be able to view your project in a browser and login with the default admin/admin credentials.
Move v3resources-all-<date>.json from Export v3 Business Data into v3data/business_data. This file name could be slightly different for you (or you may have multiple files) based on how you ran the v3 export.
4. Move the v3 resource graph _nodes.csv files from v3 into your package.#
In your Arches v3 deployment, you should be able to find these files in your original source_data/resource_graphs directory, whose contents should be a _edges.csv and _nodes.csv for every resource graph in your database. We only want the _nodes.csv files.
Move the _nodes.csv files into v3data/graph_data.
After completing steps 3 and 4, your v4 package should look like this:
Now that the v3 reference data has been converted and loaded, you are ready to create the v4 Resource Models. This migration process does not attempt to create them based on your old v3 graphs. There are a number of reasons for this, but most simply, v4 graphs have different constraints and support different datatypes and structures than those in v3. In other words, your v4 database will be better off with graphs that have been created natively, not translated from v3.
Generally, we would expect the v4 graphs to look like their v3 analogs, but we have built in quite a bit of wiggle room:
The graph names can differ
The node names can differ
The graph structure can differ (though maintaining the same general branching structure is advisable)
However, there must still be a one-to-one relationship between v3 and v4 graphs and their nodes.
When it comes to node datatypes, the translation from v3 to v4 is pretty straight-forward.
concept-list - if multiple values per v3 branch were allowed
Important
When you set a v4 node to concept or concept-list, you will need to select which collection to use. This is why it’s best to have migrated and loaded your RDM scheme (step 5 above) before making the Resource Models.
See also
Refer to Designing the Database for help on this task. Within the Arches Designer itself, click for detailed help on each page.
Once you have built all of the Resource Models, export them into your package. You can do this one-by-one from the Arches Designer interface, or use:
If you have made any Branches, using the -g "all" argument will export them as well, which you don’t want. You’ll have to remove them from pkg/graphs/resource_models and/or move them into pkg/graphs/branches before moving on.
By the end of this step, you should have one JSON file per Resource Model in pkg/graphs/resource_models.
which will create v3data/rm_configs.json. This file will be used to link the name of your v4 Resource Models with the names of their corresponding v3 graphs, as well as point to the files that link each node. Initially its content will look like:
where "Activity" is the name of a v4 Resource Model. As the file says, you must now fill out the v3_entitytypeid value for all items. Typically, this will look something like "ACTIVITY.E7" (upper-case, with a CRM class appended to it).
Now, also as the file says, run:
python manage.py v3 generate-lookups
and you’ll see the rest of the values get filled out.
There will now be more CSV files in the v3data/graph_data directory. There is one per v3 graph, and they are used to match the names of v3 node names (column one), with v4 node names (column two). All of the v3 nodes will be listed for you, but you have to fill out the v4 node names manually, using your new Resource Models for reference. A portion of a filled out file could look like:
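Since every v3 node needs a v4 name, a quick check for rows that are still blank can save a failed conversion later. This is a sketch only: the function name and the sample node names are hypothetical, and it assumes the two-column layout described above (v3 name first, v4 name second).

```python
import csv
import io


def unmapped_v3_nodes(csv_file):
    """Return v3 node names (column one) whose v4 name (column two) is blank."""
    missing = []
    for row in csv.reader(csv_file):
        if not row:
            continue
        v3_name = row[0].strip()
        v4_name = row[1].strip() if len(row) > 1 else ""
        if v3_name and not v4_name:
            missing.append(v3_name)
    return missing


# Hypothetical rows standing in for one of the files in v3data/graph_data.
sample = io.StringIO("NAME.E41,Name\nNAME_TYPE.E55,\nDESCRIPTION.E62\n")
print(unmapped_v3_nodes(sample))
```

An empty list means every v3 node in that file has been given a v4 counterpart; it does not confirm that the v4 names actually exist in your Resource Models.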
This section provides information on how to deploy Arches on a production server (including cloud hosted deployments), how to configure and manage the server, and how to manage the data and the application.
Running Arches in “production”, as a tool for use by members of your organization or as an information source available to the public on the Web, has a few more requirements and considerations than running Arches privately on your own device.
This section documents how to set up Arches for more secure, more reliable, and more scalable production deployments. You may also choose to review documentation about Installation with Docker, because that also has a section about using Docker for production deployments.
This guide will walk you through the steps necessary to deploy Arches in a production environment. This guide assumes that you have already installed Arches and have a working Arches installation. If you have not yet installed Arches, please see Installing Core Arches. We recommend reviewing Django’s recommended checklist for production deployments in order to better understand how to deploy your Arches instances in production environments.
Most importantly, you should never run Arches in production with DEBUG=True. Open your settings.py file (or settings_local.py) and set DEBUG=False (just add that line if necessary).
Turning off the Django debug mode will:
Suppress the verbose Django error messages in favor of a standard 404 or 500 error page.
You will now find Django error messages printed in your arches.log file.
Important
Make sure you have 500.htm and 404.htm files in your project’s templates directory!
Cause Django to stop serving static files.
You must set up a real webserver, like Apache or Nginx, to serve your app. See Serving Arches with Apache.
Add Allowed Hosts and CSRF Trusted Origins to Settings#
ALLOWED_HOSTS acts as a critical safeguard against HTTP Host header attacks, ensuring that your Arches application only responds to valid hostnames. On the other hand, CSRF_TRUSTED_ORIGINS is instrumental in fortifying your application against Cross-Site Request Forgery (CSRF) attacks by specifying trusted origins for the submission of forms. Both of these settings are required for Arches to work properly in production. These settings are described in more detail in the Django documentation.
Allowed Hosts: In settings.py (sometimes set via settings_local.py) you will need to add multiple items to the list of ALLOWED_HOSTS. Consider the following example:
In that example, “my-arches-site.org” is the public domain name. But the items “localhost”, “127.0.0.1” are all local network locations where Arches is deployed. You may need all of these for Arches to work properly.
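A minimal sketch of an ALLOWED_HOSTS setting built from the hostnames named above. Which local entries you actually need depends on how your web server reaches Arches, so treat the list as an illustration rather than a required configuration.

```python
# settings.py (or settings_local.py)
ALLOWED_HOSTS = [
    "my-arches-site.org",  # public domain name
    "localhost",           # local network locations where Arches is deployed
    "127.0.0.1",
]
```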
CSRF Trusted Origins: Django 4.0, a dependency of Arches 7.5, introduced a new setting for security purposes. In settings.py (sometimes set via settings_local.py) you will need to add multiple items to the list of CSRF_TRUSTED_ORIGINS. If you don’t include this, users will encounter a CSRF error (403) when they attempt to log in. See the Django documentation for details. Note the following items (with the https:// prefix):
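A minimal sketch of a CSRF_TRUSTED_ORIGINS setting, reusing the example domain from the ALLOWED_HOSTS discussion. The www variant is an assumption added for illustration; list only the origins your site is actually served from. Since Django 4.0 each entry must include the scheme.

```python
# settings.py (or settings_local.py)
CSRF_TRUSTED_ORIGINS = [
    "https://my-arches-site.org",
    "https://www.my-arches-site.org",  # assumed alias; remove if unused
]
```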
In deploying Arches in production, you have a choice in how you bundle frontend assets (CSS, JavaScript, etc.).
You can use yarn build_development followed by manage.py collectstatic to provide unminified frontend bundles.
These will be larger files, so network performance will suffer.
Alternatively, you can build production assets for the frontend, which will be minified and therefore faster for
clients to download. To make production frontend assets, use the manage.py build_production management command
(this combines both yarn build_production and manage.py collectstatic). Please note, however, that you will need
at least 8GB of RAM for the production frontend asset build itself (and much more if you’re also running the
database and backend Arches server on the same host), and you will need lots of time. Depending on your system
specifics, this can take multiple hours to complete.
During development, it’s easiest to use the Django webserver to view your Arches installation. However, once you are ready to put the project into production, you’ll have to use a more efficient, robust, and secure webserver like Apache or Nginx.
Use of Apache or Nginx involves many considerations in common, including set-up of SSL certificates for HTTPS, set-up and permissions of static assets, and running the Arches Django application with a WSGI server. The first two sections of the following guide describe how to use Apache. The next section focuses on using Nginx:
The following instructions work for Ubuntu 16.04 - 20.04; minor changes may be necessary for a different OS. This is a very basic Apache configuration, and more fine tuning
will benefit your production installation.
Install Apache.
$ sudo apt-get install apache2
Install mod_wsgi
There are two ways to install mod_wsgi. Both of them require you to start by installing the Apache and Python development headers.
$ sudo apt install apache2-dev python3-dev
Note
You may need to install the Python dev package specific to your Python version, e.g. python3.10-dev.
Now follow one of the following two options:
Install mod_wsgi directly into your Python virtual environment
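With your virtual environment activated, install mod_wsgi with pip (pip install mod_wsgi) and then run mod_wsgi-express module-config. Copy the two lines it prints; you will use them in step 3.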
Install mod_wsgi system-wide
Alternatively, you can use apt to install at the system level:
$ sudo apt install libapache2-mod-wsgi-py3
Note that the version of Python 3 installed at the system level may need to match the version used to create the virtual environment pointed to in the config.
For example, if libapache2-mod-wsgi-py3 is compiled against Python 3.10, use Python 3.10 for your virtual environment.
Installing mod_wsgi this way means you will not need to load it as a module in the Apache .conf file.
Create a new Apache .conf file
Here is a basic Apache configuration for Arches. If using a domain
like heritage-inventory.org, name this file heritage-inventory.org.conf,
otherwise, use something simple like arches-default.conf.
The paths below are based on an example project in /home/ubuntu/Projects/my_project.
# If you have mod_wsgi installed in your python virtual environment, paste the text generated
# by 'mod_wsgi-express module-config' here, *before* the VirtualHost is defined.
LoadModule wsgi_module "/home/ubuntu/Projects/ENV/lib/python3.10/site-packages/mod_wsgi/server/mod_wsgi-py310.cpython-310-x86_64-linux-gnu.so"
WSGIPythonHome "/home/ubuntu/Projects/ENV"

<VirtualHost *:80>

    WSGIApplicationGroup %{GLOBAL}
    WSGIDaemonProcess arches python-path=/home/ubuntu/Projects/my_project
    WSGIScriptAlias / /home/ubuntu/Projects/my_project/my_project/wsgi.py process-group=arches

    # May be necessary to support integration with possible 3rd party mobile apps
    WSGIPassAuthorization on

    ## Uncomment the ServerName directive and fill it with your domain
    ## or subdomain if/when you have your DNS records configured.
    # ServerName heritage-inventory.org

    <Directory /home/ubuntu/Projects/my_project/>
        Require all granted
    </Directory>

    # This section tells Apache where to find static files. This example uses
    # STATIC_URL = '/media/' and STATIC_ROOT = os.path.join(APP_ROOT, 'static')
    # NOTE: omit this section if you are using S3 to serve static files.
    Alias /media/ /home/ubuntu/Projects/my_project/my_project/static/
    <Directory /home/ubuntu/Projects/my_project/my_project/static/>
        Require all granted
    </Directory>

    # This section tells Apache where to find uploaded files. This example uses
    # MEDIA_URL = '/files/' and MEDIA_ROOT = os.path.join(APP_ROOT)
    # NOTE: omit this section if you are using S3 for uploaded media
    Alias /files/uploadedfiles/ /home/ubuntu/Projects/my_project/my_project/uploadedfiles/
    <Directory /home/ubuntu/Projects/my_project/my_project/uploadedfiles/>
        Require all granted
    </Directory>

    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html

    # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
    # error, crit, alert, emerg.
    # It is also possible to configure the loglevel for particular
    # modules, e.g.
    #LogLevel info ssl:warn

    # Recommend changing these file names if you have multiple arches
    # installations on the same server.
    ErrorLog /var/log/apache2/error-arches.log
    CustomLog /var/log/apache2/access-arches.log combined

</VirtualHost>
Disable the default Apache conf, and enable the new one.
Replace arches-default with the name of your new .conf file if needed.
At this point, you can try accessing your Arches installation in a browser, but
you’re likely to get some kind of file permissions error. Continue to the next section.
Important
With Apache serving Arches, any changes to a .py file (like settings.py)
will not be reflected until you reload Apache.
Or, if either arches.log or uploadedfiles doesn’t yet exist, you can
just allow www-data to create them at a later point by giving write access
to your project directory.
You should now be able to access your Arches installation in a browser, but
there is one more important step.
Run collectstatic.
This Django command places all of the static files (CSS, JavaScript, etc.)
used in Arches into a single location that a webserver can find. By default,
they are placed in my_project/my_project/static, based on STATIC_ROOT.
Note
You can change STATIC_ROOT all you want, but be sure to update the
Alias and Directory info in the Apache conf accordingly.
(ENV)$ python manage.py collectstatic
The first time this runs it will take a little while (~20k files), and may
show errors/warnings that you can safely ignore.
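For reference, the static-file Alias in the example Apache configuration assumes settings along the following lines. This is a sketch only: APP_ROOT is hard-coded here to match the example project path used in this guide, whereas a real project's settings.py normally computes it.

```python
import os

# Hypothetical project location, matching the example Apache paths in this guide.
APP_ROOT = "/home/ubuntu/Projects/my_project/my_project"

# URL prefix under which the web server exposes static files.
STATIC_URL = "/media/"

# Directory that collectstatic writes into; the Apache Alias must point here.
STATIC_ROOT = os.path.join(APP_ROOT, "static")
```

If you change either value, the Alias and Directory entries in the web server configuration need the same change.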
Finally, make sure Apache has write access to this static directory because
django-compressor needs to update the CACHE contents inside it:
Many Django applications use the open source Nginx application as a proxy server. If you want to use Nginx + uWSGI instead of Apache + mod_wsgi, you should start with this tutorial. You can also use Nginx with Gunicorn (an increasingly popular way to securely run a Django application). To use Nginx and Gunicorn, please start with this tutorial.
If you’re using Gunicorn, don’t forget to first install it into the Python virtual environment you are using for Arches:
$ # install gunicorn into your Arches virtual environment
$ pip install gunicorn
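Gunicorn can read its options from a Python configuration file. The sketch below is one possible starting point, not a file shipped with Arches: the project name my_project and the timeout value are assumptions, and the bind address is chosen to match the localhost:8000 upstream used in the example Nginx configuration in this section.

```python
# gunicorn.conf.py: Gunicorn reads its settings from module-level names.
import multiprocessing

# Must match the address Nginx proxies to (the example Nginx config uses localhost:8000).
bind = "localhost:8000"

# Common starting point suggested in the Gunicorn documentation: (2 x cores) + 1.
workers = multiprocessing.cpu_count() * 2 + 1

# Hypothetical project name; points at the WSGI callable in my_project/wsgi.py.
wsgi_app = "my_project.wsgi:application"

# Seconds before a silent worker is restarted (assumed value; Gunicorn's default is 30).
timeout = 120
```

With this file in the project directory, Gunicorn could be started with gunicorn -c gunicorn.conf.py. The wsgi_app setting is only available in newer Gunicorn releases; on older ones, pass my_project.wsgi:application on the command line instead.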
As is the case with Apache, Nginx will need appropriate permissions to serve static files. Every time you run collectstatic, you may change the file permissions, and you may need to rerun the following:
It’s sometimes useful to have an example configuration to help get you started. This Nginx configuration can be used as a guide.
Note
The configuration provided below asks Nginx to compress text files (css, javascript, etc). This may help to noticeably improve performance for the Arches user interface.
server_names_hash_bucket_size 64;
proxy_headers_hash_bucket_size 512;
server_names_hash_max_size 512;
large_client_header_buffers 8 64k;
proxy_read_timeout 3600;
proxy_connect_timeout 3600;

# Connect to the Arches Django app running with Gunicorn.
upstream django {
    server localhost:8000;
}

# The not encrypted plain HTTP config
server {
    listen 80;
    charset utf-8;
    server_name my-arches-project.org www.my-arches-project.org;

    location ^~ /.well-known/acme-challenge/ {
        default_type "text/plain";
        autoindex on;
        allow all;
        root /var/www/certbot/$host;
    }

    access_log /logs/nginx/access.log;
    error_log /logs/nginx/error.log;

    proxy_read_timeout 3600;
    proxy_set_header X-Forwarded-Protocol $scheme;

    gzip on;
    gzip_disable "msie6";
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/ld+json application/geo+json text/xml application/xml application/xml+rss text/javascript application/javascript text/html;

    # Redirect to use HTTPS
    location / {
        return 301 https://$host$request_uri;
    }
}

# The encrypted HTTPS config
server {
    listen 443 ssl;
    server_name my-arches-project.org www.my-arches-project.org;

    access_log /logs/nginx/ssl_access.log;
    error_log /logs/nginx/ssl_error.log;

    proxy_set_header X-Forwarded-Protocol $scheme;
    proxy_read_timeout 3600;

    ssl_certificate /etc/your-ssl-path/fullchain.pem;
    ssl_certificate_key /etc/your-ssl-path/privkey.pem;

    # NOTE! These other config files are not documented here
    include /etc/nginx/options-ssl-nginx.conf;
    ssl_dhparam /etc/nginx/sites/ssl/ssl-dhparams.pem;
    include /etc/nginx/hsts.conf;

    # NOTE! By default, NGINX only allows a 1MB file upload.
    # The following config raises this to 100MB
    client_max_body_size 100M;

    # Ask Nginx to use gzip compression to send javascript, css, etc.
    gzip on;
    gzip_disable "msie6";
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_buffers 16 8k;
    gzip_http_version 1.1;
    gzip_types text/plain text/css application/json application/ld+json application/geo+json text/xml application/xml application/xml+rss text/javascript application/javascript text/html;

    location ^~ /.well-known/acme-challenge/ {
        default_type "text/plain";
        autoindex on;
        allow all;
        root /var/www/certbot/$host;
    }

    # For the 'alias', use the correct path to the location where Arches
    # puts static files after 'collectstatic'. Like Apache (see above)
    # Nginx will also need permissions to serve the static files.
    location /static/ {
        autoindex on;
        allow all;
        alias /path_to_arches_static_files_after_collectstatic/;
        include /etc/nginx/mime.types;
    }

    location @proxy_to_django {
        proxy_pass http://django;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Host $server_name;
    }
}
Secure Sockets Layer (SSL) enables the server to establish an encrypted link with its clients. This is more secure than using unencrypted communication. To implement SSL you will need a digital certificate, which can be signed either by you or by a certificate authority. For more information about SSL certificates, please see this article.
Implementing SSL on your server can be divided into two stages:
An SSL certificate can be signed either by you, using your own private key, or by a certificate authority. Each choice has its own prerequisites and consequences.
A good guide about how to implement this using OpenSSL on Ubuntu 20.04 can be found here.
Note
This option allows you to implement SSL using your server’s IP address without a domain name. However, when accessing the website using any modern browser the connection will be marked as not private.
Let’s Encrypt is a non-profit certificate authority run by Internet Security Research Group that provides X.509 certificates for Transport Layer Security encryption at no charge. You can obtain a certificate from other certificate authorities too. Please keep in mind that some authorities require a fee for their services.
Note
For this option you will need a domain name to use for your website.
To use the digital certificate in serving your website you need to modify the webserver configuration. You can modify the current configuration file to add the new configuration or create a new configuration file. In this guide we will use one file.
Start by adding the domain as a variable at the top of the file as such
ServerName yourDomainName
Then modify the current configuration to redirect the requests from port 80 to port 443. You will need to add this code
RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L]
You can transfer all the configuration related to arches to the new virtual host 443 and change </path/to/your/certificate/> to reflect the location of your certificate. Your file should look like this
ServerName yourDomainName
LoadModule wsgi_module "/home/ubuntu/Projects/ENV/lib/python3.10/site-packages/mod_wsgi/server/mod_wsgi-py310.cpython-310-x86_64-linux-gnu.so"
WSGIPythonHome "/home/ubuntu/Projects/ENV"
<VirtualHost *:80>
ServerName yourDomainName
ServerAdmin webmaster@localhost
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
# This is optional, in case you want to redirect people
# from http to https automatically.
RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^(.*)$ https://%{HTTP_HOST}$1 [R=301,L]
</VirtualHost>
<VirtualHost *:443>
WSGIPassAuthorization on
WSGIDaemonProcess arches python-path=/home/ubuntu/Projects/my_project
WSGIScriptAlias / /home/ubuntu/Projects/my_project/my_project/wsgi.py process-group=arches
<Directory /home/ubuntu/Projects/my_project/>
Options Indexes FollowSymLinks
AllowOverride None
Require all granted
</Directory>
Alias /media/ /home/ubuntu/Projects/my_project/my_project/static/
<Directory /home/ubuntu/Projects/my_project/my_project/static>
Options Indexes FollowSymLinks
AllowOverride None
Require all granted
</Directory>
Alias /files/uploadedfiles /home/ubuntu/Projects/my_project/my_project/uploadedfiles
<Directory /home/ubuntu/Projects/my_project/my_project/uploadedfiles>
Options Indexes FollowSymLinks
AllowOverride None
Require all granted
</Directory>
ServerName yourDomainName
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
SSLEngine on
SSLCertificateFile </path/to/your/certificate/>cert.pem
SSLCertificateKeyFile </path/to/your/certificate/>privkey.pem
SSLCACertificateFile </path/to/your/certificate/>chain.pem
</VirtualHost>
Then you will need to enable the SSL and rewrite modules before you reload the Apache configuration
sudo a2enmod ssl
sudo a2enmod rewrite
Now you can reload Apache to load the new configuration
Arches uses Celery (https://docs.celeryq.dev/en/stable/getting-started/introduction.html), a Python framework for setting up and managing task queues. Using Celery,
Arches can delegate certain tasks with long execution times to separate processes. In a production deployment, this can enable Arches to delegate big jobs to a
queue so that requests to the Arches application do not lead to timeout or other errors.
This documentation discusses how to enable Celery task management using Supervisord (http://supervisord.org/). Essentially, Supervisord automatically monitors and controls Celery workers, checking to make sure they are operating, and restarting them if they fail.
Arches does not require Supervisord and Celery to run in “production” mode (with DEBUG=False in settings.py). Arches instances managing smaller amounts of data may not need Supervisord and Celery. The deployment scenarios where you should consider using Supervisord and Celery include:
Supervisord and Celery will be required if you want to enable export / bulk download of more than 2000 resource instances.
Supervisord and Celery will be required to enable the Bulk Data Manager plugin to function. (Note: by default, Arches installs this plugin in hidden state.)
Supervisor and Celery Installation and Configuration
The following is a guide for a Linux-based OS; be advised you can change any of the file names, destinations, or permissions to suit your needs.
Supervisor can be installed in your Arches virtual environment with pip: pip install supervisor.
In the core arches repo, in arches/install/supervisor_celery_setup there exist example files for supervisor, celeryd, and celerybeat. We recommend copying them into the following directory structure:
In the content of the files as well as the filenames themselves, replace the values of the following placeholders:
/absolute/path/to/virtualenv/ - absolute path to your python3 virtualenv
[app] - replace this with the value of ELASTICSEARCH_PREFIX in your project’s settings.py file
/absolute/path/to/my_proj - absolute path to your arches project
my_proj_name - name of your project
Note that you can change the value for user in the -supervisord.conf file to a designated user to run supervisord.
Before proceeding, you will want to make sure that whichever user you designate to run supervisor has the appropriate permissions for the following files:
Once RabbitMQ is successfully installed (and you have verified that it has been added to your PATH), start it with the command rabbitmq-server. For convenience, this can be run in a screen session. Note that RabbitMQ should be running before you start supervisord. If you choose, you can use Redis as a “broker” instead of RabbitMQ (see below).
Run supervisord -c /etc/supervisor/my_proj_name-supervisord.conf to start supervisord, which will start Celery workers for your tasks.
To check and stop your supervisord process, please review the following:
To check on the status of celery (workers): supervisorctl -c /etc/supervisor/my_proj_name-supervisord.conf status
To restart celery workers: supervisorctl -c /etc/supervisor/my_proj_name-supervisord.conf restart celery
To stop celery workers: supervisorctl -c /etc/supervisor/my_proj_name-supervisord.conf stop celery
To shut down supervisord (and the celery processes it controls): unlink /tmp/supervisor.sock
Redis can serve as an alternative to RabbitMQ, but it lacks official Windows support. If you are not deploying Arches on Windows, you can use Redis as follows:
Follow the above directions (steps 1 to 5) for setting up supervisord, Celery, and their configurations
Install the Python interface to Redis into your Arches virtual environment with pip: pip install redis
Configure the CELERY_BROKER_URL (in settings.py or overwritten in settings_local.py): CELERY_BROKER_URL = "redis://@localhost:6379/0"
Activate Redis: redis-server
Run supervisord -c /etc/supervisor/my_proj_name-supervisord.conf to start supervisord and the Celery workers
Start Arches
Known Issue with Arches Celery Configurations and Celery Beat
The default configuration files in the conf.d directory discussed above need updating. Celery version 5 and later uses a revised order of arguments in celery commands (see: archesproject/arches#9202).
The corrected syntax (Celery >= 5.x) for the command at the top of my_proj_name-celeryd.conf looks like:
/absolute/path/to/virtualenv/bin/celery -A my_proj_name.celery worker --loglevel=INFO
The corrected syntax (Celery >= 5.x) for the command at the top of my_proj_name-celerybeat.conf looks like (all in one line):
/absolute/path/to/virtualenv/bin/celery -A my_proj_name.celery beat --schedule=/tmp/celerybeat-schedule --loglevel=INFO --pidfile=/tmp/celerybeat.pid
After fixing the command syntax, the celery worker should function. However, you may still have trouble getting celery beat to work (archesproject/arches#9243). Celery beat schedules periodic tasks (much like a crontab in a Linux operating system) using a Python implementation. In many cases, Arches will function without (evident) problems even if celery beat does not work. However, if you have workarounds or fixes, please let us know!
You may encounter a problem if you use a @reboot cron job to start up Supervisor as described in the docs. This may lead to connection errors because Celery can’t reach RabbitMQ. One workaround that may help would be to wait a minute or two, and then rerun that same startup command. This will hopefully allow RabbitMQ enough time to be ready to accept connections with Supervisor and Celery.
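Rather than waiting a fixed amount of time, the startup job could poll the broker's port and only launch Supervisor once it accepts connections. The following is a sketch of that idea, not something Arches provides: the function name, the timeout values, and the fallback ports (5672 for RabbitMQ, 6379 for Redis, their usual defaults) are assumptions.

```python
import socket
import time
from urllib.parse import urlparse

# Usual default ports, used when the broker URL does not name one.
DEFAULT_PORTS = {"amqp": 5672, "redis": 6379}


def wait_for_broker(broker_url, timeout=120.0, interval=2.0):
    """Return True once the broker's TCP port accepts connections, False on timeout."""
    parts = urlparse(broker_url)
    host = parts.hostname or "localhost"
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 5672)
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connect means the broker is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A startup script could call wait_for_broker with the same value as CELERY_BROKER_URL and run the supervisord command only when it returns True. Note that an open port shows the broker is listening, not that it has finished all of its own initialization.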
By the time you are in a production environment, you will have configured Arches
with a web server, such as Apache or nginx. While you need a web server to
serve the app itself, there are two pieces of the app that can be separated from
the web server and served independently. These are the ‘static’ files (the css,
javascript, and logos that are used throughout the app) and the ‘media’ files
(any user uploaded files, such as images or documents).
These static and media files need to be stored someplace accessible via Web (HTTP) requests
made by Arches users. Many of the existing tutorials on this matter are concerned with serving both
static and media files, because the more load you can take off of your web server
the better. However, for the purposes of this tutorial, we are only dealing with
media files. S3 (and other cloud storage services!) are especially suited to storing
a large (and growing) amount of files. For instance:
Cloud storage is cheap: As per the S3 price chart, it costs just $0.03 per GB per month. So a database with 10 GB of photos will have a media storage cost of $0.30/month, plus a small amount per transaction ($0.004 per 10,000 GET requests, e.g.). Google Cloud storage has similar costs, as does Azure Cloud storage.
Cloud storage is scalable: You only pay for the amount of data you have stored, and you have no real limit on how much you can store. This allows for an Arches deployment on a small server, either in-house or a small cloud instance (AWS EC2, Google, DigitalOcean, etc.) to store hundreds of gigabytes of media–photos, audio, video, documents–without having to restructure to accommodate more data.
You should be able to use Cloud storage regardless of where your app is hosted, whether on an internal server, an AWS EC2 instance, a DigitalOcean droplet, etc.
Note
We provide specific guidance for integrating Arches with Amazon S3 storage because it is currently popular and familiar to many. However, we want to emphasize that you can choose among different commercial cloud storage services to use with Arches. The S3 integration steps below will give you a general picture on how to use other cloud storage services, but you’ll need to change some specifics. Please refer to django-storages documentation for additional help on integrating with different cloud storage providers.
Note
We’ve found that by following the steps below, deleting an Information Resource
from within Arches will not automatically remove the file from your S3 bucket.
You can manually delete files from the bucket for now, or the intrepid developer
may check out the answer to this question on the Arches forum.
Warning
You may run into some version compatibility issues with Arches, Django, and django-storages. If your version of Arches uses a version of Django that is <3.2, pip installing django-storages will install the latest version of django (incompatible with Arches) and cause your Arches application to break. If you run into this problem, you may need to use pip to reinstall the Arches requirements as specified in the Arches requirements.txt file.
To use S3, you will need an AWS account, which is just an extension of a normal
Amazon account. Here’s some
information on how to get started.
Having worked through a number of existing tutorials (mostly
dylanbfox.blogspot.com,
www.caktusgroup.com,
and www.holovaty.com),
we’ve distilled these steps to show how you can use S3 in conjunction with your Arches
app. Before beginning, you will need to have set up and logged into your AWS account.
Create credentials for your Arches app
These new credentials will allow your Arches app to access the S3 bucket.
Access the AWS Identity and Access Management (IAM) Console.
Create a new user (named something like “arches_media”), and download the new credentials. This will be a small .csv file that includes an Access Key ID and a Secret Key.
Also, go to the new user’s properties, and record the User ARN.
Create a new bucket on S3
Next, you’ll need to create a new bucket and give it the appropriate settings.
Create a bucket, named something like “my-app-media” (S3 bucket names may not contain underscores).
In the new bucket properties, under Permissions, create a new bucket policy
Paste the following text into your new policy, inserting your own BUCKET-NAME and your new User ARN
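As an illustration of what such a policy can look like, the sketch below generates a bucket policy that grants a single IAM user access to one bucket. It is not necessarily the exact policy text this guide was written with: the helper name and the choice of actions (list, read, write, delete) are assumptions, and the ARN uses a placeholder account number.

```python
import json


def build_bucket_policy(bucket_name, user_arn):
    """Build an S3 bucket policy granting one IAM user access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing the bucket itself.
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Allow reading, writing, and deleting objects inside it.
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# Placeholder values: substitute your own bucket name and the User ARN recorded earlier.
policy = build_bucket_policy("BUCKET-NAME", "arn:aws:iam::123456789012:user/arches_media")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket policy editor after replacing the placeholders.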
In order to configure Arches to use your new bucket, you need to install a couple of extra Django modules in your virtual environment. These will augment Django’s flexibility in how it stores uploaded media.
Activate your virtual environment and run this command
Finally, you need to tell your app to use these new modules, give it the necessary credentials, and tell it where to store (and find) the uploaded media. Open your settings.py file…
Find the line that defines the settings “INSTALLED_APPS” and add ‘storages’ to it. It should look like this
Next, add the following lines, replacing the AWS settings values with information from earlier steps (remember the credentials.csv file you downloaded?)
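A sketch of what these settings might look like with django-storages and its boto3-based S3 backend (assuming both are installed in your virtual environment, for example with pip install django-storages boto3). The setting names are those used by django-storages; reading the values from environment variables, the ARCHES_S3_BUCKET variable name, and the default bucket name are assumptions made here so that secrets stay out of the settings file.

```python
import os

# "storages" must also be listed in INSTALLED_APPS (see the previous step).

# Store uploaded media in S3. This is the pre-4.2 Django setting name; on
# Django 4.2+ the same backend path goes under STORAGES["default"]["BACKEND"].
DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"

# Bucket created earlier; ARCHES_S3_BUCKET is a hypothetical variable name.
AWS_STORAGE_BUCKET_NAME = os.environ.get("ARCHES_S3_BUCKET", "my-app-media")

# Values from the credentials .csv file downloaded from the IAM console.
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID", "")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY", "")
```

Pasting the key values directly into settings.py also works, but keeping them in the environment avoids committing credentials to version control.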
You should be good to go! To test, create a new Information Resource in your installation and upload a file. Now go back to check out your S3 bucket through the AWS console. Your file should show up in a new folder called files within the bucket. If you are encountering issues, be sure to let us know on the forum (https://groups.google.com/forum/#!forum/archesproject).
If you’ve been doing your Arches development work locally you will eventually need
to transfer your app to a remote server of some kind in order for it to be served
through the Internet. This can be done in many different ways, and in this section
we’ll give an introductory explanation of how to use Amazon Web Services (AWS) to
deploy Arches.
AWS includes a dizzying array of systems and services. AWS names different computing services using an alphabet soup of (initially) cryptic acronyms. Acronyms mentioned in this documentation include:
AWS:
Amazon Web Services
ALB:
Application Load Balancer (ALB) provides (in this context) a means to manage requests from the outside Internet before they get directed to your EC2 instance running Arches.
EC2:
Amazon Elastic Compute Cloud (EC2) provides virtual servers with different processing speeds, memory, hard drives, and operating systems. You can install your own software (such as Arches) on EC2 instances.
RDS:
Amazon Relational Database Service (RDS) provides Amazon managed database servers (including Postgres servers) that you can use instead of manually installing and managing a database server on an EC2 instance.
S3:
Amazon Simple Storage Service (S3) provides a scalable file storage and hosting infrastructure.
IAM:
Amazon Identity and Access Management (IAM) service sets security and permissions roles and policies for different users, different applications and services, and different computing instances.
While it can be very intimidating to get started, Amazon Web Services are so widely used that you can easily find some excellent guidance and help.
This guidance is primarily intended to provide a basic introduction for simple AWS deployment architectures. Some organizations use AWS for relatively small-scale and simple projects,
and others use AWS to run large-scale and very complicated systems. An AWS deployment architecture may vary widely according to
different security requirements, scales, backup strategies, maintainability needs, and uptime and performance needs.
To help get you started, this documentation focuses on simple initial AWS deployments. An organization should consult with AWS experts in cases where there are significant security, scale, reliability, or performance requirements. Please look elsewhere for guidance on server administration and maintenance.
As noted above, you should carefully align your deployment architecture according to your specific requirements, budget, and proficiency with AWS services. This introduction illustrates just two of a wide variety of architecture options:
A Single Node Deployment (one EC2 instance)
The simplest AWS deployment architecture essentially mimics a deployment of Arches that runs as a localhost on your own machine. In this architecture, the Arches application runs on a single EC2 instance along with the dependency Postgres database server and dependency Elasticsearch server. As described in the diagram below, the only other AWS service used outside of this one EC2 instance is S3 (to configure, see: Using AWS S3 or Other Cloud Storage), for storage of user uploaded files.
A Multiple Node Deployment (two EC2 instances and RDS)
If you require greater performance, you can consider an architecture that uses multiple EC2 instances together with other AWS services, especially the RDS service. For example, you can deploy the Elasticsearch server (an Arches dependency) on a separate EC2 instance. This avoids a scenario where the Arches core application and Elasticsearch compete for the same computing resources. Similarly, you can use the RDS service to provision a Postgres database for Arches, which again more widely distributes computation across a broader infrastructure. Using multiple EC2 instances together with the RDS service may be somewhat more expensive and may involve a bit more configuration and deployment effort, but this architecture will likely have scale and performance advantages. As described in the diagram below, (core) Arches and Elasticsearch each have their own EC2 instances, RDS provisions the Postgres database to Arches, and S3 provides storage for user uploaded files (to configure, see: Using AWS S3 or Other Cloud Storage).
Arches deployed on multiple EC2 instances together with RDS.
AWS provides extremely powerful and sophisticated tools to manage permissions and security. AWS emphasizes the management of “roles” and “policies” for security. You typically use the IAM service to set roles and policies that grant specific permissions to individuals or services. A good security practice is to follow “the principle of least privilege”. This principle ensures that entities only have the bare minimum permissions necessary to perform their tasks.
If you are managing sensitive information in Arches (or any other system) you should gain proficiency with AWS security good practices and a good understanding of network architecture. For example, if you deploy Arches using an EC2 instance on a public subnet, SSH access will be more convenient, but it will be less secure than putting the Arches EC2 instance in a private subnet. The best choice of security practices and network protections will vary depending on the sensitivity of the information you manage and your operational / administrative needs.
Now that we’ve introduced different considerations for deploying Arches on AWS, we can move into more specifics about how to move a locally hosted Arches instance to a “Single Node Deployment” (see the architecture described above).
Warning
Some content below may be outdated. Both Arches and AWS are evolving systems. If you notice sections that need updating please alert us by submitting a ticket (archesproject/arches-docs#issues)!
A few components must be in place before you are ready to complete these steps.
You will need an AWS account, which is just an extension of a normal Amazon account. In the very beginning, do not worry about pricing; if you are new to AWS, everything listed below will fall in the “free tier” for one year.
You’ll need an SSH client in order to access your remote server’s command console. For Windows, we recommend PuTTY as an easy to use, light-weight SSH client. While downloading PuTTY, also be sure to get its companion utility, PuTTYgen (from the same webpage).
You’ll need an FTP client in order to transfer files (your Arches app customizations) to your server. We recommend FileZilla.
Once you have an AWS account set up, and PuTTY/PuTTYgen and FileZilla installed on your local computer, you are ready to begin.
Note
Experience with command line tools, especially those that involve the management of security (encryption) certificates (such as ssh and scp) is typically necessary to deploy and manage Arches on remote cloud computing services.
From your AWS account console, navigate to the EC2 section. You should get to a screen that looks something like this:
A (dated) view of the EC2 dashboard (AWS dashboard interfaces frequently change)#
Click on “Launch Instance”
You now have the opportunity to customize your instance before you launch it, and you should see seven steps listed across the top of the page. For our purposes, we only need to worry about a few of them:
In Step 1, choose “Ubuntu Server 22.04 LTS” as your operating system
In Step 2, choose an instance type
In Step 3, tag your server with a name (this is helpful, though not necessary)
In Step 4, you’ll need to:
Select “Create a new security group”
Name it “arches-security”
Modify the rules of this security group to match the following
| Type | Protocol | Port Range | Source |
| --- | --- | --- | --- |
| HTTP | TCP | 80 | Anywhere (0.0.0.0/0) |
| HTTPS | TCP | 443 | Anywhere (0.0.0.0/0) |
| Custom TCP Rule | TCP | 8000 | Anywhere (0.0.0.0/0) |
| SSH | TCP | 22 | My IP |
In Step 5, click Launch
When you launch the instance, you will be asked to create a new key pair. This is very important. Name it something like “arches-keypair”, and download it to an easy-to-access location on your computer. You will use this later to give the SSH and FTP clients access to your server. Do not misplace this file.
Once you have launched the instance, click “View Instances” to see your running (and stopped) EC2 instances. The initialization process takes a few moments, so we can leave AWS alone for now and head to the next step.
NOTE Your Security Group is the firewall for your server. Each rule describes a specific type of access to the server, through a specific port, from a specific IP address. Never allow access through port 22 to any IP but your own. If you need to access your server from a new location (library, university) you’ll need to update the SSH security rule with your new IP address.
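The rules above can also be written down as data and checked before you apply them. The sketch below is illustrative only: it does not call AWS, the address `203.0.113.10/32` is a documentation placeholder standing in for “My IP”, and the helper name is hypothetical. It flags any rule that exposes SSH (22) or the development server port (8000) to the whole internet.

```python
ANYWHERE = "0.0.0.0/0"

# Inbound rules for the "arches-security" group, as listed in the table above.
# 203.0.113.10/32 is a placeholder for your own IP address ("My IP").
INBOUND_RULES = [
    {"type": "HTTP", "protocol": "tcp", "port": 80, "source": ANYWHERE},
    {"type": "HTTPS", "protocol": "tcp", "port": 443, "source": ANYWHERE},
    {"type": "Custom TCP Rule", "protocol": "tcp", "port": 8000, "source": ANYWHERE},
    {"type": "SSH", "protocol": "tcp", "port": 22, "source": "203.0.113.10/32"},
]

# Ports that should not stay reachable from everywhere.
SENSITIVE_PORTS = {22, 8000}


def find_risky_rules(rules):
    """Return the rules that open a sensitive port to any address."""
    return [
        rule
        for rule in rules
        if rule["port"] in SENSITIVE_PORTS and rule["source"] == ANYWHERE
    ]


for rule in find_risky_rules(INBOUND_RULES):
    print(f"Review rule: port {rule['port']} is open to {rule['source']}")
```

With the rules as listed, only the port 8000 rule is reported. That rule is useful while testing the development server, but it should be removed once testing is finished.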
Convert your AWS .pem Key Pair to a .ppk Key Pair#
PuTTY uses key files in a different format than AWS distributes by default, so you’ll have to make a quick conversion:
Open PuTTYgen
Click Load
Find the .pem file that you downloaded when launching your instance (you may have to switch the file type filter to “All Files (*.*)”)
Once loaded, click Save
Ignore the prompt for a passphrase, and save it with the same name as your original .pem file, now with the .ppk extension.
Now go back to AWS, and look at the status of your server instance. By now, it probably says “2/2 checks passed” in the Status Checks column, and you should have an address (xx.xx.xx.xx) listed in the Public IP column. It’s ready!
Open PuTTY, and enter your server’s Public IP into the Host Name bar. Make sure Port = 22, and the Connection Type is SSH (remember the security rules we were working with?).
To make PuTTY aware of your key file, expand the SSH section in the left pane, and click on Auth. Enter your .ppk file as the “Private key file for authentication”.
Once you have the IP Address and key file in place, click Open.
Click OK to trust the certificate, and login to your server as the AWS default user ubuntu.
If everything goes well you should be greeted with a screen like this:
Congratulations! You’ve successfully navigated your way into a functional AWS EC2 instance.
Now that you have a command line in front of you, the next few steps should be very familiar. Luckily, if you are coming from Windows, you’ll find that installing dependencies on Ubuntu is much, much easier. Do all of the following from within the /home/ubuntu directory (which shows as ~ in the command prompt).
Download the install script for dependencies (this links to the v7.5.0 dependencies install, update the link for your specific version)
There’s no hard and fast rule about where in the filesystem you should install Arches on an EC2 instance. A typical deployment scenario would be to install Arches on an Ubuntu EC2 instance. For that kind of instance, Amazon will provide SSH credentials to log in as the ubuntu user (with super user privileges). That means when you log in via SSH to your Arches EC2 instance, you should find yourself in the /home/ubuntu directory.
For the sake of simplicity and consistency, we’ll assume you will be installing Arches within the /home/ubuntu directory. However, you may choose an alternate location like a sub-folder of the /opt directory. The /opt directory may be more convenient if you want to make Arches easier to manage by multiple people with different user accounts.
Once you have the dependencies installed (see Installing Core Arches), you can copy your Arches project from your local machine to the desired location on your Arches EC2 instance. You can use FileZilla to do that, or use the command-line utility scp.
To transfer files from your local environment to your EC2 instance, you’ll need to use an FTP client. In this case we’ll use FileZilla.
First, we’ll need to set up the authentication system to be aware of our AWS key file.
Open FileZilla
Go to Edit > Settings > SFTP and click Add key file…
Navigate to your .ppk file, and open it. You’ll now see your file listed.
Click OK to close the Settings
Next, you can use the “Quickconnect” bar:
Host = your server’s Public IP
Username = ubuntu
Password = <leave blank> (that’s what the .ppk file is for)
Port = 22
Once connected, you’ll see your server’s file system on the right side, and your local file system on the left. Find your local “my_hip_app” directory, and copy the entire directory to /home/ubuntu/Projects/. This example directory structure is consistent with related documentation explaining how to set up Apache or Nginx for use with Arches, see: Serving Arches with Apache or Nginx
Next, use this command to remove the Elasticsearch installation from your new app on the server (because Elasticsearch should be installed on your EC2 instance).
When you start the Django development server with python manage.py runserver 0:8000, explicitly setting the host:port to 0:8000 ensures that the server is visible to us when we try to view it remotely.
You should now be able to open any web browser and view your app by visiting your IP address like so: http://xx.xx.xx.xx:8000. Now that you have transferred your app to a remote server, it’s time to use a real production-capable webserver like Apache to serve it (that’s how we can get rid of the :8000 at the end of the url). If you can’t see Arches, check AWS networking permissions to make sure port 8000 is accessible. But once you’ve verified Arches is working, DO NOT leave port 8000 open. Leaving it open is a security risk.
Another way to check is to SSH into your Arches EC2 instance and use curl to see if Arches is responding.
curl http://localhost:8000
If the above command gives you raw HTML, then Arches is functioning and responding to requests to port 8000.
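The same check can be scripted. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and it assumes that a healthy Arches instance answers a plain GET request with an HTML page.

```python
import urllib.error
import urllib.request


def arches_is_responding(url="http://localhost:8000", timeout=5.0):
    """Return True if an HTTP server answers at `url` with an HTML page."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return response.status == 200 and "html" in content_type.lower()
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `arches_is_responding()` on the EC2 instance itself tests the server without involving the security group. If it returns True there but the site is unreachable from your browser, the problem is more likely in the AWS networking rules than in Arches.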
Keep in mind that you may need to have different values in your settings.py file once you have transferred it to a new operating system (GDAL_PATH, for example). To handle this, create and use a different settings_local.py file on each installation.
As noted above, AWS security management can be complex. It is best to consult with experts in AWS to get advice about your specific deployment scenario. Generally speaking, when implementing a “Multiple Node Deployment” (see above) architecture, you should set up a unique (and clearly named) security group for each EC2 instance and the RDS instance involved in your deployment. You can then set the minimum required “inbound” rules that allow members of each of these security groups to connect as needed. For example, an EC2 instance running Elasticsearch would have its own security group. That Elasticsearch security group would have an inbound rule that allows connections from the Arches EC2 instance security group at the desired port (9200 is the default port for API calls from clients, such as Arches, to Elasticsearch).
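The arrangement just described can be sketched as data. This example does not call AWS; the security group names are hypothetical, and port 5432 is an assumption (the PostgreSQL default) that is not stated in the text above. Each inbound rule names a source security group and not an IP range, so only members of that group can connect.

```python
# Hypothetical security group names, one per component.
ARCHES_SG = "arches-app-sg"
ELASTICSEARCH_SG = "arches-elasticsearch-sg"
DATABASE_SG = "arches-rds-sg"

# Inbound rules keyed by the group that receives the connection.
# Each rule names the source group and the TCP port it may reach.
INBOUND_RULES = {
    ELASTICSEARCH_SG: [{"source_group": ARCHES_SG, "port": 9200}],
    DATABASE_SG: [{"source_group": ARCHES_SG, "port": 5432}],  # assumed default
}


def is_allowed(source_group, target_group, port):
    """Return True if `source_group` may reach `target_group` on `port`."""
    return any(
        rule["source_group"] == source_group and rule["port"] == port
        for rule in INBOUND_RULES.get(target_group, [])
    )


# The Arches instance may query Elasticsearch.
print(is_allowed(ARCHES_SG, ELASTICSEARCH_SG, 9200))
# Elasticsearch has no reason to reach the database, so no rule exists.
print(is_allowed(ELASTICSEARCH_SG, DATABASE_SG, 5432))
```

Writing the rules this way makes the intended traffic explicit: anything not listed, such as Elasticsearch connecting to the database, is not permitted.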
Some additional (advanced) considerations include:
RDS installation of PostGIS (geo-spatial) extensions: If you use RDS for serving an Arches database, you may want to review official documentation on how to add the required PostGIS extensions.
Arches Allowed Hosts: In settings.py (sometimes set via settings_local.py) you will need to add multiple items to the list of ALLOWED_HOSTS. Consider the following example:
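A minimal sketch of such a list is shown here. The host names are the placeholders discussed in the paragraph that follows; replace each `xxx` portion with the values for your own instance.

```python
# settings.py or settings_local.py
ALLOWED_HOSTS = [
    "my-arches-site.org",  # public domain name
    "ip-10-xxx-x-x.eu-west-2.compute.internal",  # AWS internal DNS name (placeholder)
    "10.xxx.x.x",  # AWS internal IP address (placeholder)
    "ip-10-xxx-x-x",  # AWS internal short host name (placeholder)
]
```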
In that example, “my-arches-site.org” is the public domain name. But the items “ip-10-xxx-x-x.eu-west-2.compute.internal”, “10.xxx.x.x”, and “ip-10-xxx-x-x” are all AWS internal network addresses for the EC2 instance where Arches is deployed. You may need all of these for Arches to work properly.
Arches CSRF Trusted Origins: Django 4.0, a dependency of Arches 7.5, changed this setting for security purposes: each entry must now include the scheme. In settings.py (sometimes set via settings_local.py) you will need to add multiple items to the list of CSRF_TRUSTED_ORIGINS. If you don’t include this, users will encounter a CSRF error (403) when they attempt to log in. See the Django documentation for details. Note the following items (with the https:// prefix):
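A sketch of this setting, assuming the same example domain used for ALLOWED_HOSTS. The `www` variant is a hypothetical second entry, included only to show that each origin is listed separately with its scheme.

```python
# settings.py or settings_local.py
CSRF_TRUSTED_ORIGINS = [
    "https://my-arches-site.org",
    "https://www.my-arches-site.org",  # hypothetical additional origin
]
```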
Once you’ve verified that you have properly installed Arches and its dependencies on your EC2 instance, it’s time to configure Arches to work with either Apache or Nginx web servers. Apache or alternatively Nginx play an important role in security and performance. Configuring Apache or Nginx is a necessary aspect of deploying Arches in production. Please review Serving Arches with Apache or Nginx to learn more about production deployment of Arches.