4TU.ResearchData Hackathon

Challenges

Challenge 1 - API Documentation

This one is our work in progress.

Contribute to the djehuty documentation by adding info about the (v2) API.

In this challenge you will expand Section 5 of the documentation. This section already provides information about some of the endpoints, but it is still under construction.

Where to start?

In the djehuty folder, find file

src/djehuty/web/wsgi.py

In the file, you can find a procedure url_map belonging to the class ApiServer. This procedure serves as a dictionary between URIs and procedures. Look for sections V2 API. Confront this section with the documentation and add sections that are missing, following the documentation convention.

To change the documentation files, go to

doc/api.tex

Challenge 2 - Propose new stats visualisation

This one is for those who feel creative!

Propose visualizations for datasets statistics presented on the landing page. You can also use other statistics than those displayed now.

And a bonus question: Which statistics would you like to have available via djehuty API call?

Where to start?

Start by creating the dummy data. Upload some files and publish them.

Then explore how the present statistics is created in:

src/djehuty/web/resources/html_templates/portal.html

Look how the data is retrieved using the procedure: repository_statistics in the djehuty/src/djehuty/web/database.py. Inspect the SPARQL templates used and adapt the queries to get the statistics you need.

Once you have the data, start visualising!

Challenge 3 - GitHub/Gitlab Release integration

This one might sound familiar!

The challenge is based on an issue raised after last meeting between the DCC and 4TU.ResearchData technical team.

Work on an automated workflow to connect code hosting platform (GitHub/GitLab/BitBucket/Savannah/Forgejo) releases with 4TU.ResearchData.

Where to start?

There is already some good discussion in the issue, so that can be a starting point.

Explore the GitHub documentation and find out how webhooks can be used to achieve our goal.

Challenge 4 - (Advanced) Git big repo results in big memory consumption

This one was bugging us for a while.

When you git clone (relatively) big files from 4TU.ResearchData, djehuty process memory usage exceeds 4GB which can cause a system crash.

You can help us solve this puzzle.

Where to start?

The process that is causing the issue is probably subprocess.run in __git_passthrough procedure. You will find it in src/djehuty/web/wsgi.py.