Technical TA Guide

This guide is designed to give an overview of many of the systems that are core to The Data Mine (TDM).

Anyone can use this guide, but many of the topics are directed toward technical TAs.

Technical TA Responsibilities

Technical TAs help guide student teams through questions that come up during the research project.

A few core areas of focus for the technical TA position are included below:

  • Assist with setting up technical environments.

  • Guide students through building and testing different research hypotheses.

  • Help to troubleshoot technical questions.

  • Escalate questions to The Data Mine team when additional assistance is required.

Core Technologies

ACCESS

The ACCESS platform is the first stop for new people in The Data Mine.

Any users who would like to log in to Anvil (students and mentors) will need to set up an ACCESS ID.

Users sometimes encounter an error while creating an ACCESS ID. When that happens, they can submit a ticket to ACCESS Support to request assistance.

The Data Mine can help to submit tickets as well.

If you need assistance with an ACCESS ticket, submit the issue information and a screenshot of the error to datamine-help@purdue.edu and the data science team will work with you to submit a ticket to ACCESS.

After the ACCESS ID is set up, the steps below need to be completed:

  1. The ACCESS ID needs to be added to The Data Mine’s allocation.

  2. After the ID is added to the allocation, the ID also needs to be added to the security group for the team project directory on Anvil.

    • This should happen automatically within 24 hours. If there is an issue, submit a ticket to datamine-help@purdue.edu with the information below:

      • User’s ACCESS ID

      • Student team name

Anvil

Anvil is a high-performance computing (HPC) cluster at Purdue University. The system is maintained by the Rosen Center for Advanced Computing.

Once the ACCESS setup above is complete, users can log in to ondemand.anvil.rcac.purdue.edu. The OnDemand Anvil platform gives users access to common applications like JupyterLab, VS Code, RStudio, and SAS.

A few important notes about the Anvil platform:

  • Be sure to use the applications under The Data Mine section of OnDemand.

    • These are set up by The Data Mine for users.

  • When running code in an environment, the seminar kernel is most commonly used.

    • This kernel has many coding packages preloaded.

  • By default, The Data Mine’s apps limit the number of resources that can be requested for a session.

  • Similarly, The Data Mine does have GPU resources, but these require approval.

  • When users first connect, they may not see their project directory.

    • The helpful tips section shows how to set up a link to the project directory.

  • Users aren’t able to install their own packages. When a package is needed, there are two options:

    1. Contact datamine-help@purdue.edu and we’ll add the package for you.

    2. Set up a custom environment on Anvil.

Custom environments allow students to install packages at will.

However, custom environments are not supported by The Data Mine.

If your team is working in a custom environment and it stops working, The Data Mine won’t be able to help.
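Before requesting a package or building a custom environment, it can help to confirm whether the package is already available in the current kernel. The sketch below shows one way to check this with the Python standard library. The function name check_packages and the example package names are illustrative assumptions, not part of The Data Mine's tooling.

```python
from importlib import metadata, util


def check_packages(names):
    """Map each top-level import name to its installed version.

    The value is None when the module cannot be found, and
    "unknown version" when it imports but has no matching package
    metadata (for example, standard-library modules, or packages whose
    install name differs from the import name, such as sklearn).
    """
    report = {}
    for name in names:
        # find_spec locates the module without actually importing it.
        if util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "unknown version"
    return report


if __name__ == "__main__":
    # Example names only; replace with the packages your team needs.
    for package, version in check_packages(["numpy", "pandas"]).items():
        status = version if version is not None else "NOT AVAILABLE"
        print(f"{package}: {status}")
```

Running this in a notebook cell with the seminar kernel selected gives a short list of what is missing, which can be included in the email to datamine-help@purdue.edu.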

GitHub

Git is installed by default on Anvil. Teams are strongly encouraged to use GitHub for version control and documentation.

Each student team will have a GitHub repository set up by default.

If you aren’t sure of the repo’s name, email datamine-help@purdue.edu.

The Data Mine has a GitHub on Anvil guide that helps teams with their initial repo setup and interaction.

It’s helpful to review the GitHub guide above and test it out before your team starts using GitHub.

This way you can troubleshoot any issues and help lead the team through their setup.

It’s also common for teams to see permission issues when first using GitHub.

If anyone has a permissions issue, send their GitHub ID to datamine-help@purdue.edu. The team will add them to The Data Mine’s GitHub organization and the permissions group for the repo.

If the user has trouble finding the GitHub invite, check: github.com/orgs/TheDataMine/invitation

Windows Servers

Specific applications, like Power BI, Tableau, or ArcGIS Pro, may require a Windows Server.

If your team needs a Windows environment, the first step is to email datamine-help@purdue.edu.

When submitting a ticket, be sure to include:

  • Your team’s name

  • The email of each student who will need access to the server

Once the server is ready, The Data Mine team will walk you through the Windows server connection process.

Team Research

One of the most important aspects of The Data Mine is that it gives teams a great chance to build and test hypotheses with very low consequences.

As part of this, the technical TA should be a core driver of a team’s research philosophy.

When a team is researching a new technique, or stuck on a problem, think through:

  • What is being done in industry?

  • Are there publications that show how similar problems were solved?

  • Can the problem be broken down into smaller parts?

  • Are there any subject matter experts at Purdue or the mentor’s company who could help?

  • Would a team brainstorming session help to find potential solutions?

The Data Mine will always be here to help, but one of the most important things you can take away from these projects is the ability to think critically, come up with solutions, and then test those solutions to see what works.

Experiential learning projects are a great time to build these skills because the projects are focused on the team learning and growing together.

When submitting a ticket to The Data Mine team, we’ll want to know:

  • What is the problem?

    • Code examples are always amazing.

  • What research was done to try to fix the problem?

  • What were the outcomes of those attempted fixes?

  • Do you have any theories on what may be causing the problem?

It’s always OK to ask for help, but we want to understand what steps you took to try to solve the problem as a team before escalating to us.
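As a rough illustration, the information requested above can be gathered into a consistent message before emailing datamine-help@purdue.edu. The helper below is a minimal sketch; the function name format_ticket, its parameters, and the example text are hypothetical and not an official ticket format.

```python
def _bullets(items):
    """Render strings as indented bullets, or a placeholder if there are none."""
    if not items:
        return ["  (none provided)"]
    return [f"  - {item}" for item in items]


def format_ticket(problem, code_example="", research=(), outcomes=(), theories=()):
    """Assemble the problem, research, outcomes, and theories into one message."""
    if not problem.strip():
        raise ValueError("Describe the problem before submitting a ticket.")
    lines = ["Problem:", f"  {problem.strip()}"]
    if code_example.strip():
        lines.append("Code example:")
        lines += [f"    {row}" for row in code_example.strip().splitlines()]
    lines += ["Research done:"] + _bullets(research)
    lines += ["Outcomes of attempted fixes:"] + _bullets(outcomes)
    lines += ["Theories on the cause:"] + _bullets(theories)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up example content, for illustration only.
    print(
        format_ticket(
            problem="Notebook kernel restarts when loading the full dataset.",
            code_example="df = pd.read_csv(path)",
            research=["Loaded only the first 1,000 rows"],
            outcomes=["The smaller load succeeded"],
            theories=["The session may not have enough memory"],
        )
    )
```

Empty sections are printed as "(none provided)" so it is obvious to the team which questions still need an answer before the ticket is sent.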

Documentation

Documentation is one of the most impactful and least popular tasks for a team. Many of The Data Mine’s projects continue for multiple years but bring in new students.

That means that if a team doesn’t do a good job with documentation, the next team may spend their first semester (or more) working out what was done previously.

TAs should help the teams continually build their documentation. This can be done through a GitHub README.

It’s often a good idea to hand over your documentation to someone who isn’t directly in the project (mentor or mentor’s colleague) and see if they can follow the steps.

Treat documentation like any other work task. Make documentation updates deliverables and review them as a team.

The more practice a person gets, the easier documentation gets.
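One lightweight way to make a README reviewable as a deliverable is to agree on a list of sections and check for them before each review. The sketch below assumes a Markdown README with "#" headings. The section names in EXPECTED_SECTIONS and the function name missing_sections are hypothetical examples, not a Data Mine standard.

```python
# Hypothetical section list; agree on your own as a team.
EXPECTED_SECTIONS = [
    "Project Overview",
    "Environment Setup",
    "Data",
    "How to Run",
    "Next Steps",
]


def missing_sections(readme_text, expected=EXPECTED_SECTIONS):
    """Return the expected section titles that have no matching heading.

    A heading is any line starting with '#'. This is a simplification:
    '#' comments inside fenced code blocks would also be counted.
    """
    headings = set()
    for line in readme_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' characters and normalize case.
            headings.add(stripped.lstrip("#").strip().lower())
    return [title for title in expected if title.lower() not in headings]


if __name__ == "__main__":
    sample = "# Project Overview\n\nSome text.\n\n## Environment Setup\n"
    print(missing_sections(sample))
```

The check only confirms that sections exist. Whether someone outside the project can follow them still needs a human reviewer.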