Allow creating a record on Dataverse directly from GitHub

At the university where I work, we are receiving more and more requests from researchers who would like to automatically create a Dataverse record based on the state of their GitHub repository. Currently, the dataverse-uploader GitHub action requires the researcher to:

1. Create a data record on Dataverse and fill in the metadata manually
2. Copy the DOI from Dataverse and add it to the `workflow.yml` file
3. Run the GitHub workflow
4. Go back to Dataverse and submit for review

This workflow makes the researcher go back and forth between GitHub and Dataverse many times. I would like to propose the addition of a feature to create a data record upon running the GitHub action for the first time. This would reduce the number of times the researcher needs to switch between Dataverse and GitHub, and it might also help the researcher by automatically filling in as much metadata as possible based on the information in the GitHub repository.

## Proposed user interface

The `DATAVERSE_DATASET_DOI` field could be made optional:
- If the DOI is provided, the action works as it currently does.
- If the DOI is **_not_** provided, a new Dataverse record is created using as much information as possible from the GitHub repository to define the metadata, and then the action proceeds as always.
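
The branching described above could look roughly like the following. This is only a sketch: `resolve_dataset_doi` and the `create_dataset` callback are hypothetical names, not part of the existing action, and the sketch assumes the input reaches the script as an environment-style mapping.

```python
from typing import Callable, Mapping


def resolve_dataset_doi(
    env: Mapping[str, str],
    create_dataset: Callable[[], str],
) -> str:
    """Return the DOI to upload to (hypothetical helper).

    If DATAVERSE_DATASET_DOI is set, it is used unchanged, which matches
    the current behaviour. Otherwise create_dataset() is called and is
    expected to return the DOI of the newly created record.
    """
    doi = env.get("DATAVERSE_DATASET_DOI", "").strip()
    if doi:
        return doi
    return create_dataset()
```

Inside the action this might be called as `resolve_dataset_doi(os.environ, create_dataset=...)`, so existing workflows that already set the DOI would not change behaviour.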

## Proposed implementation

- The "[Create a Dataset in a Dataverse Collection](https://guides.dataverse.org/en/latest/api/native-api.html#id50)" endpoint of the Dataverse native API can be used to implement the main feature.
- [GitHub Contexts](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/accessing-contextual-information-about-workflow-runs#about-contexts) can be used to automatically populate as much metadata as possible for the newly created record on Dataverse.
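
As an illustration of how the GitHub context could feed the metadata, here is a minimal sketch that builds a dataset JSON document from the default environment variables GitHub sets in a workflow run (`GITHUB_REPOSITORY`, `GITHUB_ACTOR`, `GITHUB_SERVER_URL`). The function name, the choice of fields, the `"Other"` subject, and the `contact_email` parameter are assumptions. In particular, a contact email is not available from the GitHub context, so it would have to come from a workflow input. The JSON layout follows my understanding of the citation metadata block and should be checked against the target Dataverse installation.

```python
import json
from typing import Any, Dict, List, Mapping


def _primitive(type_name: str, value: str) -> Dict[str, Any]:
    # A single-valued primitive field in the Dataverse dataset JSON format.
    return {"typeName": type_name, "multiple": False,
            "typeClass": "primitive", "value": value}


def _compound(type_name: str, children: List[Dict[str, Any]]) -> Dict[str, Any]:
    # A compound field holding one entry made of the given child fields.
    return {"typeName": type_name, "multiple": True, "typeClass": "compound",
            "value": [{child["typeName"]: child for child in children}]}


def build_metadata_from_github(env: Mapping[str, str], contact_email: str,
                               subject: str = "Other") -> Dict[str, Any]:
    """Build a minimal dataset description from GitHub environment variables."""
    repo = env.get("GITHUB_REPOSITORY", "unknown/unknown")  # "owner/name"
    owner, _, name = repo.partition("/")
    actor = env.get("GITHUB_ACTOR", owner)
    server = env.get("GITHUB_SERVER_URL", "https://github.com")
    fields = [
        _primitive("title", name or repo),
        _compound("author", [_primitive("authorName", actor)]),
        _compound("datasetContact", [
            _primitive("datasetContactName", actor),
            _primitive("datasetContactEmail", contact_email),
        ]),
        _compound("dsDescription", [
            _primitive("dsDescriptionValue", f"Snapshot of {server}/{repo}"),
        ]),
        {"typeName": "subject", "multiple": True,
         "typeClass": "controlledVocabulary", "value": [subject]},
    ]
    return {"datasetVersion": {"metadataBlocks": {"citation": {
        "displayName": "Citation Metadata", "fields": fields}}}}


def write_metadata_file(metadata: Dict[str, Any], path: str = "metadata.json") -> None:
    # Serialize the metadata so it can be sent as the request body.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)
```

The researcher would probably still need to review and complete the record on Dataverse, since the repository only provides a small part of the citation metadata.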

For example, the action could follow these steps:

1. Create a temporary "metadata.json" file, which the API needs in order to create a record, and populate it from the GitHub context.
2. Make the API request to create a new record based on a given "metadata.json" file. Something like:

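```python
import os
import requests

# Assumes SERVER_URL includes the scheme (e.g. "https://demo.dataverse.org")
# and PARENT is the alias of the target Dataverse collection.
server_url = os.environ['SERVER_URL'].rstrip('/')
parent = os.environ['PARENT']

headers = {
    'X-Dataverse-key': os.environ['API_TOKEN'],
    'Content-type': 'application/json',
}

# Read the dataset description that is sent as the request body.
with open('metadata.json', 'rb') as f:
    data = f.read()

response = requests.post(
    f'{server_url}/api/dataverses/{parent}/datasets',
    headers=headers,
    data=data,
)
response.raise_for_status()  # stop here if the record was not created
```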

3. Extract the DOI of the newly generated Dataverse record from the response object and use it as if it had been provided by the user in the `DATAVERSE_DATASET_DOI` field.
4. Optionally, have the workflow replace the missing value of the `DATAVERSE_DATASET_DOI` field with the newly created DOI. This would ensure that no new Dataverse record is created if the action is rerun.
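
The DOI extraction step could be sketched as follows. It assumes the create-dataset response body has the shape `{"status": "OK", "data": {"id": ..., "persistentId": "doi:..."}}`, which is my reading of the native API guide and should be verified against the target Dataverse version. `extract_persistent_id` is a hypothetical helper name.

```python
from typing import Any, Mapping


def extract_persistent_id(payload: Mapping[str, Any]) -> str:
    """Return the persistent identifier (DOI) from a create-dataset response.

    `payload` is the parsed JSON body, e.g. the result of `response.json()`.
    """
    if payload.get("status") != "OK":
        raise RuntimeError(f"Dataset creation failed: {payload!r}")
    persistent_id = (payload.get("data") or {}).get("persistentId")
    if not persistent_id:
        raise RuntimeError("Response does not contain a persistentId")
    return persistent_id
```

With the request above, the call would be `extract_persistent_id(response.json())`, and the returned value would take the place of the user-provided `DATAVERSE_DATASET_DOI` for the rest of the run.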


