
Anand's Data Stories

Learn. Share. Repeat.


Category: AWSDataWrangler

Rename Glue Tables using AWS Data Wrangler

Posted November 1, 2020 by Anand | 1 comment | Posted in AWS, AWSDataWrangler

I had a use case of renaming over 50 existing Glue tables by adding a “prod_” prefix. AWS Athena does not support the native Hive DDL command “ALTER TABLE table_name RENAME TO”. So one option was to use “Generate Create Table DDL” in AWS Athena, modify the table name, and execute the DDL. Preview the […]
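
As a rough illustration of the renaming idea, here is a minimal sketch that copies each table definition under a prefixed name and then drops the old entry. It uses plain boto3 Glue calls rather than AWS Data Wrangler, so it is not necessarily the approach taken in the full post. The helper names `prefixed_table_input` and `rename_with_prefix` are hypothetical, and the sketch assumes unpartitioned tables: `create_table` does not copy partitions, so partitioned tables would need them registered again.

```python
# Keys accepted by Glue's TableInput structure. get_table/get_tables return
# extra read-only fields (CreateTime, DatabaseName, ...) that must be dropped.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def prefixed_table_input(table, prefix="prod_"):
    """Build a TableInput dict for a copy of `table` under a prefixed name."""
    table_input = {k: v for k, v in table.items() if k in TABLE_INPUT_KEYS}
    table_input["Name"] = prefix + table["Name"]
    return table_input


def rename_with_prefix(database, prefix="prod_"):
    """Copy every table in `database` to prefix + name, then delete the original."""
    import boto3  # imported lazily so the helper above works without AWS access

    glue = boto3.client("glue")
    # Collect the table list first so we do not modify the catalog mid-pagination.
    tables = []
    for page in glue.get_paginator("get_tables").paginate(DatabaseName=database):
        tables.extend(page["TableList"])

    for table in tables:
        if table["Name"].startswith(prefix):
            continue  # already renamed
        glue.create_table(
            DatabaseName=database,
            TableInput=prefixed_table_input(table, prefix),
        )
        glue.delete_table(DatabaseName=database, Name=table["Name"])
```

Only the catalog entry changes here; the underlying S3 data stays where it is because the storage location is carried over unchanged.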

Transform AWS CloudTrail data using AWS Data Wrangler

Posted September 17, 2020 (updated September 24, 2020) by Anand | Posted in AWS, AWSDataWrangler

The AWS CloudTrail service captures actions taken by an IAM user, IAM role, APIs, SDKs and other AWS services. By default, AWS CloudTrail is enabled in your AWS account. You can create a “trail” to record ongoing events, which are delivered in JSON format to an Amazon S3 bucket of your choice. You can configure the […]
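
To give a feel for the JSON that a trail delivers, here is a small sketch that flattens one log file into tabular rows. It assumes the file has already been downloaded and decompressed into a string, and it relies on the standard CloudTrail layout of a top-level "Records" array. The function name `flatten_cloudtrail` and the choice of fields are illustrative assumptions, not taken from the full post.

```python
import json


def flatten_cloudtrail(raw):
    """Turn the JSON text of one CloudTrail log file into a list of flat dicts."""
    records = json.loads(raw).get("Records", [])
    rows = []
    for rec in records:
        identity = rec.get("userIdentity", {})  # nested object describing the caller
        rows.append({
            "eventTime": rec.get("eventTime"),
            "eventSource": rec.get("eventSource"),
            "eventName": rec.get("eventName"),
            "awsRegion": rec.get("awsRegion"),
            "userType": identity.get("type"),
            "userArn": identity.get("arn"),
        })
    return rows
```

The resulting list of dicts can be handed to `pandas.DataFrame(rows)` for further transformation.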

Reading Parquet files with AWS Lambda

Posted April 14, 2020 (updated September 24, 2020) by Anand | Posted in AWS, AWSDataWrangler

I had a use case to read data (a few columns) from a Parquet file stored in S3 and write it to a DynamoDB table every time a file was uploaded. Planning to use AWS Lambda, I was looking at options for reading Parquet files within Lambda until I stumbled upon AWS Data Wrangler. From the document […]
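
The flow described above can be sketched roughly as follows: an S3 upload event triggers the Lambda function, which reads only the needed columns from the new Parquet file. This assumes AWS Data Wrangler (`awswrangler`) is available to the function, for example through a Lambda layer. The column names in `COLUMNS` and the helper `s3_paths_from_event` are hypothetical, and the DynamoDB write is left out.

```python
from urllib.parse import unquote_plus

COLUMNS = ["id", "name"]  # hypothetical column names, replace with your own


def s3_paths_from_event(event):
    """Extract s3://bucket/key paths from an S3 event notification."""
    paths = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        paths.append(f"s3://{bucket}/{key}")
    return paths


def lambda_handler(event, context):
    import awswrangler as wr  # imported lazily; assumed to come from a Lambda layer

    row_counts = {}
    for path in s3_paths_from_event(event):
        # Read only the columns we need instead of the whole file.
        df = wr.s3.read_parquet(path=path, columns=COLUMNS)
        row_counts[path] = len(df)
        # Writing the rows to DynamoDB would go here (omitted in this sketch).
    return row_counts
```

Restricting `columns` keeps the memory footprint small, which matters under Lambda's memory limits.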
