Azure Third Party Data Sharing

Solving the data-sharing problem (and getting rid of the old SFTP way of working).

As we know, data is an important asset for any company. The data may be structured or unstructured. Analysis is done on that data, and business decisions are made from the analysis.

Sharing data with 3rd parties, such as suppliers or external analytics teams, is an important need for an organization. The usual worry is how to share that data without giving the 3rd parties access to the organization's enterprise systems.

This post describes one way we found to share data with external 3rd parties while keeping them out of those systems.

In the above diagram, the data is generated on one of the VMs in the Azure cloud. We have to set up the Azure SDK for Python on that VM, which lets us push files to Azure Storage.

Link: https://azure-storage.readthedocs.io/

Once the environment is set up, you are ready to push files from a local path to an Azure Blob Storage container. The upload jobs can be scheduled via crontab or Jenkins. Download jobs can be scheduled the same way, to bring the files uploaded by the 3rd parties back to the local path.
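The upload step can be sketched roughly as below. This is a minimal sketch, not the exact job we run: it assumes the current azure-storage-blob (v12) package rather than the legacy SDK in the link above, and the "outbound" prefix, the function names and the connection string argument are all hypothetical.

```python
from pathlib import Path


def blob_name_for(local_path: str, prefix: str = "outbound") -> str:
    """Build the blob name for a local file as '<prefix>/<file name>'."""
    name = Path(local_path).name
    prefix = prefix.strip("/")
    return f"{prefix}/{name}" if prefix else name


def upload_file(connection_string: str, container: str, local_path: str) -> str:
    """Upload one local file to a blob container and return the blob name."""
    # Imported here so blob_name_for() works even without the SDK installed.
    # Assumes: pip install azure-storage-blob (v12).
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    blob_name = blob_name_for(local_path)
    blob = service.get_blob_client(container=container, blob=blob_name)
    with open(local_path, "rb") as data:
        # overwrite=True replaces an earlier upload with the same name.
        blob.upload_blob(data, overwrite=True)
    return blob_name
```

A scheduled job (crontab or Jenkins) would then only need to call upload_file() with the storage connection string, the 3rd party's container name and the path of the file to send.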

Inside the Azure Storage account:

In the storage account, you can create separate blob containers that are private, so no one outside can access them apart from the authorized 3rd parties.
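Creating one private container per 3rd party could look something like this. It is a sketch under assumptions: it uses the azure-storage-blob (v12) package, the function names are hypothetical, and the name check simply mirrors Azure's container naming rules (3 to 63 characters, lowercase letters, digits and single hyphens, starting and ending with a letter or digit).

```python
import re

# Azure container naming rules: 3-63 characters, lowercase letters, digits
# and hyphens, no consecutive hyphens, first and last character alphanumeric.
_CONTAINER_NAME = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_container_name(name: str) -> bool:
    """Return True if the name satisfies the container naming rules."""
    return _CONTAINER_NAME.fullmatch(name) is not None


def create_private_container(connection_string: str, name: str) -> None:
    """Create a container with no anonymous (public) access."""
    if not is_valid_container_name(name):
        raise ValueError(f"invalid container name: {name!r}")

    # Imported here so the validation above works without the SDK installed.
    from azure.core.exceptions import ResourceExistsError
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    try:
        # public_access=None keeps the container private (also the default).
        service.create_container(name, public_access=None)
    except ResourceExistsError:
        pass  # the container is already there, nothing to do
```

Keeping one container per 3rd party means each access token can later be scoped to just that party's container.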

Security:

We need to set up a GPG (.gpg) encryption and decryption mechanism: we import the public key of each 3rd party and encrypt the files with that public key. Once the data is uploaded to the containers, the 3rd parties download it and decrypt it with their secret key.
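The encryption flow could be scripted along these lines. This is only a sketch: it assumes the gpg command-line tool is installed on the VM, and the helper names, key file and recipient address are hypothetical.

```python
import subprocess
from typing import List


def gpg_import_cmd(key_file: str) -> List[str]:
    """Command that imports a 3rd party's public key into the local keyring."""
    return ["gpg", "--batch", "--import", key_file]


def gpg_encrypt_cmd(src: str, recipient: str) -> List[str]:
    """Command that encrypts src for recipient, writing '<src>.gpg'."""
    return [
        "gpg", "--batch", "--yes",
        # Skips the trust prompt for imported keys; verify the key
        # fingerprint with the 3rd party through another channel first.
        "--trust-model", "always",
        "--recipient", recipient,
        "--output", f"{src}.gpg",
        "--encrypt", src,
    ]


def gpg_decrypt_cmd(encrypted: str, dest: str) -> List[str]:
    """Command the 3rd party runs to decrypt with their secret key."""
    return ["gpg", "--batch", "--yes", "--output", dest, "--decrypt", encrypted]


def encrypt_for_partner(src: str, recipient: str) -> str:
    """Encrypt a file for one 3rd party and return the encrypted file path."""
    subprocess.run(gpg_encrypt_cmd(src, recipient), check=True)
    return f"{src}.gpg"
```

The import command is run once per 3rd party, for example subprocess.run(gpg_import_cmd("partner_public.asc"), check=True); after that, only the encrypted .gpg file is uploaded to the container.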

Access:

Once the container is created, the 3rd parties are given access via a SAS token (shared access signature), which provides signed-URL access to the container. The files can be downloaded by putting the SAS token value in any Python code that pulls data from the container, or with Azure Storage Explorer, which is a user-friendly tool.
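Both sides of the SAS flow can be sketched as follows: the owner issues a read-only token for one container, and the 3rd party uses it to pull the files. This is a sketch under assumptions: it uses the azure-storage-blob (v12) package, and the function names, the 24-hour expiry and the account and container names are all hypothetical.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import List


def container_sas_url(account: str, container: str, sas_token: str) -> str:
    """Combine account, container and SAS token into one signed URL."""
    token = sas_token.lstrip("?")
    return f"https://{account}.blob.core.windows.net/{container}?{token}"


def make_read_only_sas(account: str, account_key: str, container: str,
                       hours: int = 24) -> str:
    """Owner side: issue a SAS token that can only read and list blobs."""
    from azure.storage.blob import ContainerSasPermissions, generate_container_sas

    return generate_container_sas(
        account_name=account,
        container_name=container,
        account_key=account_key,
        permission=ContainerSasPermissions(read=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(hours=hours),
    )


def download_all(sas_url: str, dest_dir: str) -> List[str]:
    """3rd-party side: download every blob in the container to dest_dir."""
    from azure.storage.blob import ContainerClient

    client = ContainerClient.from_container_url(sas_url)
    saved = []
    for blob in client.list_blobs():
        target = Path(dest_dir) / blob.name
        # Blob names may contain '/', so create sub-folders as needed.
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(client.download_blob(blob.name).readall())
        saved.append(str(target))
    return saved
```

A short expiry and read/list-only permissions limit what a leaked token can do; a 3rd party that also uploads files would need a token with write permission on its own container.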