IAM access, IAM security, IAM confusing at times

An overview of IAM fundamentals on GCP

Introduction 👋

It is vital to understand IAM if you use Google Cloud or any cloud provider. IAM is a service for creating and managing access to GCP resources in a single system. Incorrect usage is a risk and can lead to security problems in the future. This post talks about the basics of IAM and what users should know when using the cloud.

IAM 🎟️

IAM stands for Identity and Access Management. An IAM policy has three main components: it describes an identity that has access to specific resources and how that identity can interact with them. This can be shortened to who, what, and how. Each resource on GCP has permissions containing the who, what and how. These permissions are grouped into roles that can be assigned to users. IAM checks that every API call has the appropriate permissions to use the resource.

Generally speaking, the best practices for IAM are:

  • Grant people the least privilege needed
  • Handle service accounts and their keys with care
  • Audit by monitoring logs
  • Ensure policies adhere to company requirements

Access is assigned to an identity, and this can come in a few forms on Google Cloud including:

  • Google Account - a person (e.g. Fumi)
  • Service account - an account for an application (e.g. Web app on compute engine)
  • Google group - a group of people and/or service accounts (e.g. Data Analysts group)
  • Google Workspace domain - a group of Google Accounts created in an organisation's Google Workspace account (e.g. If Flingmycow was a name of a company, then everyone with an email *@flingmycow.com)
  • Cloud Identity domain - a group of Google Accounts in an organisation, but not necessarily linked to Google Workspace (e.g. A company that has Office 365 and uses GCP for their compute workloads)
  • All authenticated users - an identifier used to allow any Google Account on the internet to access a resource
  • All users - an identifier used to allow anyone on the internet, authenticated or not, to access a resource. Some resources do not support this type of access.

How could this work? If you have a team of data analysts who need read-only access to data, you should create a Google Group that contains all the required emails and give it the BigQuery Data Viewer role. It's recommended to use Google Groups for access as they are easier to manage than individual users.
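To make the data analysts example concrete, here is a minimal sketch of what that grant looks like as data. It mirrors the role and members shape of an IAM policy binding, but it only builds a dictionary and does not call any Google API. The group address is made up for illustration, and make_binding is my own helper name.

```python
def make_binding(role, members):
    """Return a dict shaped like one IAM policy binding: a role and who holds it."""
    return {"role": role, "members": sorted(members)}


# Hypothetical group address, used purely for illustration.
analysts_group = "group:data-analysts@example.com"

# Who: the analysts group. How: the BigQuery Data Viewer role.
binding = make_binding("roles/bigquery.dataViewer", [analysts_group])
print(binding)
```

Adding or removing an analyst then becomes a change to the group's membership rather than a change to the policy.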

Resources are objects or instances a user can be granted access to use. This ranges from GCP projects to virtual machines to buckets in Google Cloud Storage.

It's worth noting that when permissions are applied at a higher level than a resource, such as an Organisation or Folder, they flow down to lower-level resources. The effective policy for a resource is the union of the policy set at that resource and the policies inherited from higher-level resources.
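The "union" idea can be sketched in a few lines of Python. This is a simplified model, assuming each policy is just a mapping from a role to a set of members; the member addresses and the effective_policy helper are made up for illustration and are not part of any Google library.

```python
def effective_policy(*policies):
    """Merge policies (highest level first) into one role -> members mapping."""
    merged = {}
    for policy in policies:
        for role, members in policy.items():
            # Union: inherited members are kept, lower levels can only add more.
            merged.setdefault(role, set()).update(members)
    return merged


# Hypothetical policies at three levels of the hierarchy.
org_policy = {"roles/viewer": {"group:everyone@example.com"}}
folder_policy = {"roles/bigquery.dataViewer": {"group:data-analysts@example.com"}}
project_policy = {"roles/bigquery.dataViewer": {"user:fumi@example.com"}}

result = effective_policy(org_policy, folder_policy, project_policy)
for role in sorted(result):
    print(role, sorted(result[role]))
```

Notice that nothing set at the project level can take away the access granted at the Organisation level in this model, which is why broad grants high up deserve extra care.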

Each resource has a set of permissions that can be granted to an identity. The permissions are in the format service.resource.verb, so to have the ability to create datasets, one needs the permission bigquery.datasets.create - these permissions usually map to a single REST API method.
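As a small illustration of the service.resource.verb format, the sketch below splits a permission name into its three parts. The Permission type and parse_permission function are my own names, and the strict three-part check is an assumption based on the format described above.

```python
from typing import NamedTuple


class Permission(NamedTuple):
    service: str
    resource: str
    verb: str


def parse_permission(name: str) -> Permission:
    """Split a permission such as 'bigquery.datasets.create' into its parts."""
    parts = name.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected service.resource.verb, got {name!r}")
    return Permission(*parts)


perm = parse_permission("bigquery.datasets.create")
print(perm.service, perm.resource, perm.verb)
```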

It would be time-consuming to assign individual permissions to a user. Therefore, it's best practice to grant roles instead. A role is a collection of permissions that it makes sense for a user to have together. For example, the BigQuery Admin role has all BigQuery service permissions, whereas the BigQuery Data Viewer role has a subset of permissions related to getting and listing data in tables.
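One way to picture a role is as a named set of permissions. The sketch below models that idea; the permission lists are short illustrative subsets I picked for the example, not the real role definitions, and role_grants is a hypothetical helper.

```python
# Illustrative subsets only; the real roles contain many more permissions.
ROLES = {
    "roles/bigquery.dataViewer": frozenset({
        "bigquery.datasets.get",
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
    }),
    "roles/bigquery.admin": frozenset({
        "bigquery.datasets.create",
        "bigquery.datasets.get",
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
    }),
}


def role_grants(role: str, permission: str) -> bool:
    """Return True if the (modelled) role contains the permission."""
    return permission in ROLES.get(role, frozenset())


print(role_grants("roles/bigquery.dataViewer", "bigquery.tables.getData"))
print(role_grants("roles/bigquery.dataViewer", "bigquery.datasets.create"))
```

In this model the viewer can read table data but cannot create datasets, which is the least-privilege behaviour you want for a read-only team.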

It's also possible to grant the basic roles (formerly called primitive roles): Owner, Editor, and Viewer. These grant permissions across all Google Cloud resources, which is easier to use but brings higher risk. It's recommended to use predefined roles instead where possible. Additionally, it's possible to create custom roles if you have requirements that can't be fulfilled with predefined roles.

There are a few ways of interacting with Google Cloud that don't use the console.

If you use either the Google Cloud SDK or any of the client libraries, you will need to authenticate with one of the following methods.

Google recommends using default service accounts and Google Cloud Client Libraries in your applications. Here's why:

  1. If you have an application in Compute Engine, Google Kubernetes Engine, App Engine, Cloud Run or Cloud Functions using the default service account, then it can retrieve the credentials without a JSON file. This is more convenient and secure than manually passing credentials.
  2. The Google Cloud Client Libraries use Application Default Credentials (ADC) to automatically find your service account credentials. ADC looks for credentials in the GOOGLE_APPLICATION_CREDENTIALS environment variable first, and otherwise uses the credentials associated with a default service account if possible.
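To show the idea behind that lookup, here is a simplified sketch of the order ADC checks things in: the environment variable, then a local credentials file created by the gcloud CLI, then the service account attached to the resource. This is my own model of the search order and not the library's actual code; find_credentials_source and its parameters are hypothetical names.

```python
import os
from pathlib import Path


def find_credentials_source(environ, well_known_file, on_gcp):
    """Return a description of where credentials would come from."""
    # 1. An explicit key file named in the environment variable wins.
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return f"key file from environment variable: {key_path}"
    # 2. Next, credentials saved locally by the gcloud CLI.
    if well_known_file.is_file():
        return f"local ADC file: {well_known_file}"
    # 3. Finally, the service account attached to the running resource.
    if on_gcp:
        return "attached service account via the metadata server"
    return "no credentials found"


adc_file = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
print(find_credentials_source(os.environ, adc_file, on_gcp=False))
```

The practical upshot is that the same application code can run on your laptop and on GCP without changes, as long as one of these sources is available.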

If you cannot use a default service account, then you need to download a service account JSON file. To point your code at the JSON file, use any of the following methods:

# Command Line (no spaces around the = sign)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"
# Python: set the environment variable before creating any client
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "/path/to/keyfile.json"
# Python with a client library function that reads the key file directly
from google.cloud import storage
storage_client = storage.Client.from_service_account_json('service_account.json')

When creating Compute Engine virtual machine instances, you may notice an Access Scopes setting in the configuration. Access scopes are the legacy method of specifying permissions for instances. They define the default OAuth scopes used in requests from gcloud and the client libraries. These access scopes must be configured in order to run the instance as a service account.

The best practice is to set the full cloud-platform access scope on the instance, then securely limit the service account's API access with IAM roles. Access scopes apply on a per-instance basis.

If you're using the gcloud CLI tool and developing locally, using gcloud auth login will obtain credentials for your user account via a web-based authorization flow. It sets the account as active in the configuration.

To use the Google Cloud Client Libraries locally, run the following command. It generates credentials and stores them ready to be used by the client libraries.

gcloud auth application-default login
  • For Mac and Linux the credentials are stored at ~/.config/gcloud/application_default_credentials.json
  • Windows users can run the command gcloud info --format='value(config.paths.global_config_dir)' to obtain the path. Mine is at C:\Users\flingmycow\AppData\Roaming\gcloud\application_default_credentials.js

You can also see all the configurations you have set up by running gcloud config configurations list, which is handy as you may need to connect to multiple projects.

Never ever publicly share your JSON keys or upload them to a repository like GitHub. Doing so puts your account and project(s) at risk of compromise, which could lead to unwanted bitcoin mining and a huge bill. The worst case scenario is that your data and code are held to ransom and business operations are shut down.
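One cheap safeguard is to check files before they are committed. Service account key files are JSON documents with a "type" of "service_account" and a "private_key" field, so a rough check could look like the sketch below. The function name is my own, and this is only a heuristic that will not catch every kind of secret.

```python
import json


def looks_like_service_account_key(text: str) -> bool:
    """Heuristic: does this text look like a service account JSON key?"""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return (
        isinstance(data, dict)
        and data.get("type") == "service_account"
        and "private_key" in data
    )


# Fake content for illustration; a real key file has more fields.
sample = '{"type": "service_account", "private_key": "not-a-real-key"}'
print(looks_like_service_account_key(sample))
```

You could run a check like this from a pre-commit hook, alongside adding key file names to .gitignore.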

If you require explicit credentials then it's possible to pass a Credentials object when instantiating a Client object.

client = Client(credentials=credentials)

Instructions for this approach are outlined here

Thanks for reading and I hope this helps clarify the basics of IAM on Google Cloud!