Artifactory Archives | JFrog Release Fast Or Die

Proudly Announcing JFrog’s Full Conformance to OCI v1.1 https://jfrog.com/blog/full-conformance-to-oci-v1-1/ Tue, 24 Sep 2024 15:19:27 +0000

JFrog has long supported standards widely used by developers, including OCI container images. We started with our OCI-compliant Docker registry, then followed up with dedicated JFrog Artifactory OCI repositories. In our continued commitment to developer freedom of choice, we’re excited to take another leap forward.

JFrog is now fully conformant to OCI v1.1. Source: OCI Conformance Page

JFrog is now fully certified to the OCI v1.1 standard. We are proud to be one of only two vendors who have made this commitment thus far. Available from JFrog Artifactory version 7.90.1 for OCI and Helm OCI repositories, this support gives JFrog users everything they need for their OCI packages.

What is OCI v1.1?

The Open Container Initiative (OCI) is a Linux Foundation project providing open standards for container formats and runtimes, and is widely adopted by developers. It seeks to optimize industry-wide interoperability while maintaining performance.

Originally announced in July 2023, OCI v1.1 is the latest version of OCI’s image, runtime, and distribution specifications. It offers developers more flexibility and integrity, plus the ability to link images with one another (using the subject field). Additionally, the new artifactType field makes it easy to label artifacts so they can be filtered and parsed by type.

Referrers API

One of the most interesting features in OCI v1.1 is the Referrers API, which offers a convenient approach to retrieve and filter the relationships between images using subject and artifactType. With Referrers API, and by leveraging JFrog Artifactory, you can transfer these relationships between different repositories with ease.

How Referrers API Works

The best way to illustrate the power of Referrers API is with examples.

Example 1:
Fetching All Subjects

Let’s say we have an image called Image A. Two other images, Image B and Image C, are related to Image A. To define these relationships, we can use the subject field to point both Image B and Image C to Image A.

Querying the Referrers API for Image A will return both Image B and Image C as its referrers.
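To make that concrete, here’s a rough sketch of what the lookup can look like over the OCI Distribution API. The registry host, repository path, and digest below are placeholders, not real values:

# List all artifacts that declare Image A's manifest as their subject.
# Host, repository path, and digest are placeholders.
curl -s -H "Authorization: Bearer $TOKEN" \
  "https://example.jfrog.io/v2/my-docker-repo/image-a/referrers/sha256:<IMAGE_A_DIGEST>" \
  | jq '.manifests[] | {digest, artifactType}'

The response is an OCI image index whose manifests array lists Image B and Image C.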

Example 2:
Fetching Specific Artifact Types

Now, let’s say you want to retrieve only specific artifact types. Let’s assume Image B is an SBOM artifact, and Image C is a signature artifact. Using the artifactType field, you can denote these references.

Using the Referrers API, you can retrieve just the signature artifacts related to Image A, and it will correctly return Image C as the signature related to Image A.
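In API terms, this is the same call with an artifactType filter; the type value below is illustrative and depends on how the signature was pushed:

# Return only referrers of Image A whose artifactType matches the filter.
# Host, digest, and artifactType value are placeholders.
curl -s -G -H "Authorization: Bearer $TOKEN" \
  --data-urlencode "artifactType=application/vnd.example.signature.v1" \
  "https://example.jfrog.io/v2/my-docker-repo/image-a/referrers/sha256:<IMAGE_A_DIGEST>"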

As you can see from these two examples, Referrers API in OCI v1.1 makes it easy to access valuable information that can be used to show the relationships of associated OCI packages.

OCI Containers and Repositories in JFrog Artifactory

JFrog’s support of OCI means you get seamless access to everything OCI within Artifactory. You can use OCI natively, securely, and reliably, all within a single point of truth.

For more information on how to get started, including sample code snippets, visit our Referrers API page on the JFrog Help Center. To see how JFrog allows you to do more with your OCI repos, take a tour of our platform or speak to a JFrog team member.

OpenTofu support comes to JFrog Artifactory https://jfrog.com/blog/opentofu-support-in-jfrog-artifactory/ Tue, 23 Apr 2024 16:31:57 +0000

If you deploy container-based services in Kubernetes, chances are you’re also using infrastructure-as-code to help automate the provisioning and maintenance of the cloud environments where your applications will run. Up until recently, Terraform was “the name” in infrastructure-as-code. However, HashiCorp’s decision in the second half of 2023 to change Terraform from an open source license to a proprietary license sent many developers looking for alternatives, with OpenTofu garnering many supporters.

To provide for the OpenTofu community, we’ve expanded our Terraform repositories to work with the OpenTofu client and registry.

Before we touch on that, let’s do a quick recap on OpenTofu and its relation to Terraform.

What is OpenTofu?

OpenTofu is an open-source infrastructure-as-code tool. It’s a Terraform fork created as an initiative of multiple organizations and contributors in response to HashiCorp’s switch of Terraform from an open-source license to the Business Source License (BUSL).

OpenTofu is also a Linux Foundation project, which means it’ll remain free and open for all to use. In January 2024, the Linux Foundation announced that OpenTofu was generally available, meaning that it is now a production-ready project.

NOTE: We acknowledge the situation surrounding the removed block functionality in OpenTofu, which some claim was copied from Terraform. Artifactory’s support of OpenTofu is not indicative of a stance on this issue, but solely to better serve our customers and development community.

JFrog’s support of OpenTofu

As of version 7.81.0, Artifactory supports caching artifacts from the OpenTofu provider registry using the OpenTofu Client.

Because of the similarities and shared properties of OpenTofu and Terraform, organizations can leverage JFrog’s Terraform repositories for OpenTofu. All that’s required is toggling one setting in the “Set Me Up” dialog and you’re good to go.

Setting up a Terraform Client
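As a rough sketch of what that client-side setup can look like, you can point the OpenTofu CLI at Artifactory through its CLI configuration file. The host name, repository key, and mirror URL below are assumptions; the Set Me Up dialog shows the exact values for your instance:

# Resolve providers through Artifactory instead of the public registry.
# Host and repository key are placeholders -- copy the real URL from Set Me Up.
cat > ~/.tofurc <<'EOF'
provider_installation {
  network_mirror {
    url = "https://example.jfrog.io/artifactory/api/terraform/tf-providers-remote/providers/"
  }
}
EOF

tofu init   # subsequent provider downloads now go through Artifactory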

The OpenTofu client also works with Artifactory’s Terraform Backend (BE) repositories, allowing you to manage your state files in Artifactory as well.

Log into your JFrog account to start managing your OpenTofu files with JFrog. Not an Artifactory user, but want to see how it can help manage your OpenTofu files? Start a free 14-day trial.

Four Key Lessons for ML Model Security & Management https://jfrog.com/blog/four-key-lessons-for-ml-model-security-management/ Wed, 21 Feb 2024 20:35:13 +0000

With Gartner estimating that over 90% of newly created business software applications will contain ML models or services by 2027, it is evident that the open source ML revolution is well underway. By adopting the right MLOps processes and leveraging the lessons learned from the DevOps revolution, organizations can navigate the open source and proprietary ML landscape with confidence. Platforms like JFrog’s, which include ML model management capabilities, can further support organizations on their journey toward successful adoption.

Since the first open source package from the GNU Project was released by Richard Stallman in 1983, there has been a huge evolution of software reproducibility, reliability, and robustness. Concepts such as Software Development Life Cycle Management (SDLC), Software Supply Chain (SSC), and Release Lifecycle Management (RLM) have become a cornerstone of how to manage and secure software development environments.

In terms of MLOps, here are four lessons covering topics specifically related to AI models:

  • Traceable versioning schemas
  • Artifact caching and availability
  • Model and dataset licensing
  • Finding trusted open source ML repositories

These lessons for managing and securing ML development environments are a must-learn for AI developers and MLOps professionals.

As enterprises increasingly embrace machine learning models and services, it becomes crucial to leverage open source packages while simultaneously ensuring security and compliance. Open source models and datasets offer numerous benefits, but they also come with their own set of challenges and risks. Here we will explore some key lessons learned from the DevOps revolution and see how they can be applied to ensure successful adoption of open source ML models.

Lesson 1 – Adopt a clear, traceable versioning schema

Versioning allows an organization to be sure the software it creates is using the right parts. With good versioning you can roll back bad deployments and reduce the number of patches needed for live customers experiencing bugs in the application.

In the traditional world of software development, Semantic Versioning (SemVer) is the standard. Semantic Versioning is a very powerful tool, but can only reflect a single timeline. With Semantic Versioning you can identify the present and past, as well as the order between them.

When it comes to ML model versioning, however, the case is considerably different. While software builds with the same inputs should be consistent, with ML models two sequential training sessions can lead to totally different results. In ML model training, versioning schemas have many dimensions. Training might be done in parallel, using different parameters or data, but in the end all training results require validation. Your versioning schema should carry enough metadata that Data Scientists, DevOps engineers, ML Engineers, and SREs all find it easy to understand a version’s content. While many ML tools use some form of Semantic Versioning, JFrog is taking a different approach to ML model versioning that better accommodates the complexity of ML model development and the multiple stakeholders involved in the process.
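To make that concrete, here is a purely hypothetical sketch of the kind of metadata a model version record could carry; the field names are illustrative, not a JFrog schema:

# Illustrative model-version metadata (hypothetical field names)
model: fraud-detector
version: 2024.03.1-exp42
training:
  dataset: s3://example-datasets/transactions/2024-02-snapshot
  parameters: {learning_rate: 0.001, epochs: 30}
  code_revision: 9f2c1ab
validation:
  accuracy: 0.947
  approved_by: ml-review-board
runtime:
  framework: pytorch-2.1
  base_image: example.jfrog.io/docker-local/train-base:1.4

A record like this lets a data scientist see the training inputs, an ML engineer see the runtime, and an SRE see exactly what was approved for release.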

Lesson 2 – Cache every artifact you use as they might disappear

Not all open source projects can be relied upon for the long term. Some might close down, while in other cases companies may stop supporting packages they created, meaning that the latest version might not work as well as the previous one.

To protect against this type of instability when working with ML models, it is advised to cache everything that you use as part of training or inference: the model itself, the software packages, the container you run it in, the data, parameters, features, and more. There are various caching tools on the market, including JFrog Artifactory with ML model support, covering the most popular ML package types.
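As one small example of what this looks like in practice for Python dependencies, you can point pip at a remote (proxy) PyPI repository so every package you resolve is also cached; the server name, credentials, and repository key below are placeholders:

# Resolve and cache Python packages through an Artifactory remote PyPI repo.
# Host, credentials, and repository key are placeholders.
pip config set global.index-url \
  "https://<USER>:<TOKEN>@example.jfrog.io/artifactory/api/pypi/pypi-remote/simple"

pip install torch   # now resolved and cached through Artifactory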

Lesson 3 – Model and dataset licensing procedures

Open source does not mean free! Most open source models have a license agreement that states what you can and cannot do. Licensing is a very complex field, and you might want to consult with a legal expert before selecting a model whose license might put your company’s assets at risk. There are tools on the market to enforce licensing compliance, such as JFrog Curation and JFrog Xray, which help ensure your software licenses comply with company policy.

Lesson 4 – Use open source ML models from trusted sources only

When integrating open source into your software, you are de facto putting your trust in the software creator to maintain the quality, security, and maintenance levels you need to ensure your software runs smoothly. Unfortunately, it is quite common to adopt an open source package only to find out later that there is a critical bug and the maintainer is not capable of solving it. As a last resort, you can use your own development resources to get into the code and start patching it (after all, it is open source software), but in reality that is easier said than done and, even worse, requires resources to maintain the code going forward.

Enterprises need to come up with a set of rules that determine whether an open source package or model is mature enough to be used by their developers. JFrog’s best practice recommendations advise looking at least at the number of contributors and the date of the last release, as well as other relevant information. The JFrog Platform can assist in this effort by automating policies to make your developers’ lives easier and more productive.

Jump into the open source ML revolution with confidence

When it comes to ML models, versioning becomes more complex due to the multiple dimensions involved in training and validation. Caching every artifact used becomes essential to mitigate the risks associated with the instability of open source projects.

It is also crucial to consider the quality, security, and maintenance levels provided by the software creator, taking into consideration critical bugs that may be detected down the line, requiring companies to allocate their own resources for maintenance.

By adopting lessons learned from the DevOps revolution and applying them to the open source ML landscape, MLOps professionals can better navigate the challenges and harness the benefits of ML models effectively, securely and efficiently.

Adopting the right MLOps processes today will set you up for success tomorrow. Check out JFrog’s ML model management capabilities and key industry partnerships, to see for yourself how they can support and improve your ML development operations by scheduling a demo or starting a free trial.

Improve Cloud Visibility with JFrog’s SaaS Log Streamer https://jfrog.com/blog/jfrog-saas-log-streamer/ Mon, 05 Feb 2024 19:51:24 +0000

Updated May 21st, 2024: JFrog’s SaaS Log Streamer is now generally available for customers on JFrog Cloud Enterprise+ subscriptions.

JFrog’s SaaS Log Streamer integrates directly into APMs

The beauty of deploying SaaS-based applications is that you don’t have to worry about building the infrastructure, hiring engineers to maintain it, staying on top of upgrades, or worrying about application security. Indeed, these are some of the main benefits you get by using a SaaS offering. However, the world of software is full of trade-offs, so what do you lose out on?

One of the drawbacks of going the SaaS route is that application logs are typically not readily available, making visibility into application metrics extremely difficult, if not impossible. All those logs that you relied upon when running applications in your own infrastructure are no longer accessible when using a SaaS offering.

Being in the dark about the applications you rely on is less than ideal, as you need certain information from the logs that are critical to your business. So how do you reap all the benefits of SaaS while maintaining access to important data from your core applications?

That’s where JFrog’s SaaS Log Streamer comes to the rescue, giving you the best of both worlds – maintenance free infrastructure along with great visibility into your applications. On top of that, you minimize the noise and increase efficiency by cutting out all the irrelevant logs to concentrate on what really matters.


JFrog SaaS Log Streamer is now available for JFrog customers on JFrog Cloud Enterprise+ subscriptions.


How JFrog’s SaaS Log Streamer delivers information to leading APM tools such as Datadog and Splunk

Just as JFrog SaaS instances run on JFrog infrastructure, so does JFrog’s SaaS Log Streamer. You don’t need to install anything to get things going, as the log streamer retrieves all the necessary information from JFrog Artifactory instances, allowing only the relevant logs to pass through and eliminating the need for post-log filtering.

Out-Of-The-Box Log Streaming for JFrog SaaS Customers

By integrating the SaaS Log Streamer into the JFrog Platform capabilities, we continue to make our Cloud experience as complete and effective as possible, enabling organizations to accelerate their cloud migration efforts.

Currently the SaaS Log Streamer supports two log types:

1. Artifactory-Request Logs are a set of logs that help monitor Artifactory incoming requests. These logs allow you to track the trend of all requests based on HTTP status codes and request methods. This data can provide useful insights such as which artifacts are most requested and who the heavy users are of certain artifacts.  Here are some sample screenshots from a typical Datadog APM dashboard:

Sample dashboards showing Artifactory Log Messages imported into Datadog

2. Audit Trail Logs (access-audit, access-security-audit) register all operations related to users, groups and permissions, enabling auditing and tracking capabilities for enforcement of security policies in your organization. Here is a sample screenshot of these logs from a Datadog dashboard:

Example of Audit Trail Logs imported from JFrog Artifactory into Datadog

These are just some examples of what your log dashboards may look like. You can leverage your APM to create your own dashboards and visualizations that meet your unique requirements by leveraging the data contained within the logs referenced above.

What’s Next for Cloud Log Streaming

In addition to Datadog and Splunk, coverage will be provided for other leading log vendors such as Dynatrace, Elastic, Dataset, and others. Support will also be provided for additional Artifactory log types, as well as those from other components of the JFrog Platform.

Get Started with Log Streaming

If you are a JFrog Enterprise+ SaaS customer using Datadog or Splunk as your log destination and would like to participate in the beta, please submit the form here and make sure to reference the SaaS Log Streamer.

After your submission is received, our team will enable your instance and notify you to input your APM credentials into MyJFrog, after which logs will be flowing within an hour. Guidance on how to access and input the appropriate information into the MyJFrog portal is available here.

For more on this and additional logging functionality, schedule a demo with our solutions experts.

Integrating JFrog Artifactory with Amazon SageMaker https://jfrog.com/blog/integrating-jfrog-artifactory-with-amazon-sagemaker/ Wed, 17 Jan 2024 13:45:16 +0000

Today,  we’re excited to announce a new integration with Amazon SageMaker! SageMaker helps companies build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows. By leveraging JFrog Artifactory and Amazon SageMaker together, ML models can be delivered alongside all other software development components in a modern DevSecOps workflow, making each model immutable, traceable, secure, and validated as it matures for release. In this blog post, we’ll dive into customer use cases, benefits, and how to get started using this new integration.

Table of Contents:

  1. What is Artifactory?
  2. What is SageMaker?
  3. Why do I need Artifactory for my AI/ML development?
  4. What are the advantages of the Artifactory and SageMaker combo?
  5. What tools does JFrog provide for AI/ML model development?
  6. Where does it make sense to integrate Artifactory with SageMaker?
  7. Example workflows within Artifactory and SageMaker
  8. Final thoughts and conclusion

What is Artifactory?

JFrog Artifactory is the single solution for housing and managing all the artifacts, binaries, packages, files, containers, and components for use throughout your software supply chain. It’s more than just inventory management — Artifactory provides sophisticated access control to stored resources, enterprise scalability with support for a mature high availability architecture, and a wealth of automation tools to support the software development lifecycle (SDLC). It’s just part of the JFrog Software Supply Chain Platform, a comprehensive toolset for DevSecOps that includes tools to ensure the safety, traceability, and efficient delivery of software components for development and production environments.

What is SageMaker?

Amazon SageMaker is a fully managed service that brings together a broad set of tools to enable high-performance, low-cost machine learning (ML) for any use case. With SageMaker, you can build, train, and deploy ML models at scale using tools like notebooks, debuggers, profilers, pipelines, MLOps, and more – all in one integrated development environment (IDE).

The environments used to support the artificial intelligence (AI) and machine learning (ML) lifecycles vary based on factors such as an organization’s maturity, existing processes and related resources, and tool preference and familiarity that lets teams use existing skill sets. Often this translates to designing systems composed of multiple components to provide needed flexibility and to meet the specific needs and requirements of developer, infrastructure, and security teams.

Given the focus of SageMaker on improving the development lifecycle of AI/ML, and the flexibility and maturity of Artifactory to protect and manage software binaries, integrating these two powerful solutions is a pragmatic approach for any AI/ML development.

Why do I need Artifactory for my AI/ML development?

Models are large, complex binaries consisting of multiple parts — and just as it is for other common software components and artifacts, Artifactory is an ideal place to host, manage, version, trace, and secure models.

It’s crucial to understand what goes into AI/ML development. Just like in other types of software development, third-party software packages are frequently used to accelerate development by avoiding writing a lot of code from scratch. These packages compose the beginning building blocks of many applications and frameworks. Third-party resources can come into development environments in various ways: from pre-installed packages or frameworks, through public repositories such as PyPI, Docker Hub, and Hugging Face, or from private repositories containing proprietary code or other artifacts that require authentication.

The inventory of dependencies that your project relies on can grow and change over time, and effectively managing them involves more than knowing the first level of dependencies that often come in the form of import statements in your code. What about the dependencies of these dependencies? What about the versions of all of these dependencies? Tracking these items is imperative to both the stability and reliability of software development projects — including AI/ML projects.

Consider the following scenarios:

The latest version of a public Python package you rely on has been imported into your project, but one of its dependencies has been modified in a way that’s incompatible with your current code, causing new development to come to a halt. Unfortunately, your developers have always just downloaded the latest version and the last working version of the offending dependency isn’t available from the original source. How will you recover?

A security vulnerability is discovered in the Docker image you regularly use to put your ML model into production. Unfortunately, the vulnerability was in an old version of a base image pulled from Docker Hub, and although it was a known CVE, it wasn’t discovered until it was exploited in your production system. Could this have been prevented?

It’s discovered that an ML model is licensed in a way that prohibits your organization’s current use. Fortunately, this was caught before it was released into production. Unfortunately, much of the code base relies on this particular model’s functionality. How much time will it take to find an alternative and potentially rewrite the related code?

Any one of these scenarios can throw a wrench in your production pipeline. Practical management of the inventory of all the artifacts used and produced during AI/ML development processes will help in recovery and reduce downtime. An enterprise-level artifact management system like Artifactory provides the foundation needed for effective troubleshooting, recovery, error prevention, and the ability to address more serious regulatory compliance and security issues around licensing and security vulnerabilities by utilizing other components of the JFrog Platform, such as JFrog Xray.

What are the advantages of the Artifactory and SageMaker combo?

SageMaker already provides a ton of resources for AI/ML development workflows. You can choose from available models and images that already include the most popular distributions, Python libraries, and ML frameworks and algorithms.

But what if you are in the business of building and producing custom and proprietary artifacts?

Built-in resources are generally intended to satisfy a majority of use cases, which usually means the most commonly used packages, libraries, and models are provided. Having these available is helpful for exploring and comparing feature sets, learning the basics of the platform or a specific framework, or building proof-of-concept type material. Over time, more specialized use cases might be addressed and more built-in resources will be provided, but this will likely be decided by popular demand over an individual organization’s immediate needs.

There will undoubtedly be cases where built-in resources can only provide a starting place, including but not limited to the following scenarios:

  • Proprietary Code or Models
    Model developers may need to access proprietary resources for use in research, training, and testing.
  • Standardized Development Environments
    An organization’s platform or infrastructure team may wish to standardize development environments to improve efficiency and to meet safety and security regulations required for production deployment.
  • Lean Production Artifacts
    In some cases, it might be necessary to eliminate automatically supplied packages and frameworks that are superfluous and unused in the final product.

In all of these use cases, the ability to customize will become essential to your organization when developing production-grade custom models and the applications that use them.

What tools does JFrog provide for AI/ML model development?

JFrog is already a powerhouse in the context of artifact management and security. JFrog and AWS cooperate to provide toolsets that improve quality, speed up iterative development, and provide safe and secure environments for AI/ML development workflows.

Just as artifact management systems are essential for software projects in other industries, proper artifact management addresses many of the same problems that exist in AI/ML model development. AI/ML development can greatly benefit from utilizing existing software development processes that improve quality by adhering to established DevOps best practices and by taking advantage of existing infrastructure and tooling that meets organizational regulatory compliance requirements.

JFrog Artifactory provides the foundation for managing software artifacts in a centralized location. The following are just some of the features of Artifactory and other commonly used tools that benefit AI/ML development:

Hugging Face Repositories: Most recently, and a huge contribution to the AI/ML ecosystem, JFrog introduced Hugging Face repositories to Artifactory — remote repositories to proxy models stored in the Hugging Face hub, and local repositories to store custom and proprietary models. We also recently announced new capabilities with the general availability release of ML Model Management, including ML model versioning.

Container Image Registry: Artifactory already provides a Container Image Registry to store images used for model training and deployment as well as proxy images from popular public registries such as Docker Hub.

Python Repositories: Python repositories in Artifactory are available to store proprietary Python packages or to proxy from public indexes such as PyPI.

NOTE: Artifactory provides repositories for managing 30+ other package types in addition to PyPI packages!

JFrog CLI: A robust command line tool to communicate with and operate an Artifactory instance and other components within the JFrog Platform (e.g. JFrog Xray).

All of these can (and should!) be leveraged in SageMaker environments.

Where does it make sense to integrate Artifactory with SageMaker?

There are many ways that Artifactory can be used with SageMaker, but the primary objective is to enable, within the SageMaker service, authorized access to an Artifactory instance at the appropriate times during development, allowing the successful retrieval of artifacts from Artifactory (including proxied requests for artifacts from remote repositories), and the successful storage of all artifacts needed for a project.

But first, secure the Amazon SageMaker environment!

The recommended approach for using any of the resources in SageMaker along with Artifactory is to operate SageMaker in VPC-only mode along with a SageMaker domain setup within the VPC. Learn more about getting started with SageMaker and SageMaker domains here, and about VPC-only setup here.

Example workflows within Artifactory and SageMaker

Beginning with an environment including a private VPC and a SageMaker domain with existing users, the following examples are potential workflows for training and deploying models using SageMaker services, and working within SageMaker notebooks and Notebook Instances. Artifactory fits nicely at each place in the workflow that requires either retrieval or storage of packages, dependencies, and other development artifacts.

Because training and deploying ML models are such integral tasks of ML model development, let’s start with workflows that utilize an existing development environment on a developer’s machine, but allow for launching SageMaker training jobs and deploying ML models to a SageMaker endpoint to make predictions (inference).

Training an ML model

One of the key undertakings of a model developer is to train a particular model on a set of prepared data. The following workflow takes a developer from the creation of a container image required by the SageMaker service for the purpose of launching a SageMaker training job, to storing a trained model version in a private Hugging Face repository in Artifactory.

The first order of business is to create a Docker container image that will encapsulate a script to be used for training a model on a dataset from an AWS S3 bucket. The SageMaker Python SDK and base images readily available on Docker Hub include popular machine learning frameworks and algorithms, and provide many of the required training utilities.

In this case, the work of writing the training script and building the Docker image is accomplished on a developer’s machine, which is configured to resolve artifacts from Artifactory. Another option for a continuous integration environment is to commit changes made to a Dockerfile and/or training script to a code repo that triggers a build of the Docker image on a build server, which then delivers the resulting image to Artifactory. Note that in either case, requests for the base image, Python packages, or other dependencies to be included in the Docker image are also resolved from Artifactory.

Once the training Docker image is built, the image is pushed to a Docker repository in Artifactory, to be accessed later by the SageMaker service.
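A minimal sketch of that push, assuming a Docker repository in Artifactory (the host name, repository key, and tag are placeholders):

# Tag and push the training image to a Docker repository in Artifactory.
# Host name, repository key, and tag are placeholders.
docker login example.jfrog.io
docker tag train-image:latest example.jfrog.io/docker-local/train-image:1.0.0
docker push example.jfrog.io/docker-local/train-image:1.0.0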

 

The next step is to launch a SageMaker training job. This can also be done from a local development machine utilizing the SageMaker Python SDK. The training job is executed, and the resulting trained model version can then be uploaded to a local Hugging Face repository in Artifactory.

Deploying an ML model for Inference

Once a custom, trained model has been uploaded to Artifactory, this model can now be deployed to the SageMaker Inference service and tested using a SageMaker endpoint. The following workflow takes a developer from the creation of a container image required by the SageMaker Inference service to the deployment of a model to run predictions against.

Similar to the process of Training an ML model, begin with the creation of a custom Docker container that includes the necessary configuration and scripts for running predictions against a given model with SageMaker Inference. All packages and artifacts needed for this task are resolved from Artifactory and the resulting custom Docker image pushed to a Docker repository in Artifactory.

 

Now that the custom image is available in Artifactory, a Python script utilizing the SageMaker Python SDK can deploy the model to SageMaker Inference. All of the necessary configuration parameters required by SageMaker Inference as well as the location of the custom Docker image and the Hugging Face model in Artifactory can be provided using the SageMaker Python SDK Model initialization and methods.

Once the model is deployed and a SageMaker endpoint is active, it will be possible to run predictions utilizing the SageMaker Python SDK Predictor.

Working in Notebook Instances, SageMaker Studio Classic Notebooks, and other SageMaker resources

The sections on training and deploying ML models assumed a development environment separate from SageMaker, and the use of an IDE on an individual development machine. But what about using SageMaker tools like Studio Classic Notebooks and Notebook Instances?

One of the primary focuses of SageMaker is to provide full-featured environments to code, build, and run everything needed for AI/ML development. The primary advantage of this approach is improved efficiency for developers by avoiding context switching in and out of SageMaker. It’s also possible to customize these environments to provide only the software and tools approved for use by an individual organization to meet safety and security regulations.

Organizations that take advantage of and rely on the enterprise features of Artifactory (and the JFrog Platform) now benefit from the ability to resolve and retrieve artifacts from Artifactory, and store artifacts to it, via SageMaker Classic Notebooks and Notebook Instances.

One option is to standardize the use of Artifactory within SageMaker Classic Notebooks and Notebook instances by attaching a lifecycle configuration script. For example, consider using a lifecycle configuration script that accesses sensitive authentication and configuration information from an AWS secret, and performs the following actions with that information:

  • Install & Configure JFrog CLI
  • Configure pip for virtual PyPI repository in Artifactory
  • Configure Hugging Face SDK to work with Artifactory

As a result, each time a user launches a Notebook Instance, creates a SageMaker Classic Notebook, or launches a code console or image terminal, the setup needed for users to communicate with an instance of Artifactory is completed in a standardized and secure way. Without any additional effort, users will be able to resolve any Hugging Face Hub or Python PyPI software dependencies required for their workflow through Artifactory.
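The following is a rough, hedged sketch of what such a lifecycle configuration script could look like; the secret name, server URL, and repository keys are all assumptions, and the Knowledge Base article referenced below remains the authoritative step-by-step guide:

#!/bin/bash
# Hypothetical lifecycle configuration sketch -- all names are placeholders.
# Assumes the JFrog CLI, AWS CLI, and jq are already available in the image.
set -e

# Read connection details from an AWS secret (secret name is illustrative).
SECRET=$(aws secretsmanager get-secret-value \
  --secret-id artifactory-notebook-config --query SecretString --output text)
ARTIFACTORY_URL=$(echo "$SECRET" | jq -r '.url')      # e.g. https://example.jfrog.io
ARTIFACTORY_TOKEN=$(echo "$SECRET" | jq -r '.token')

# Configure JFrog CLI for this notebook environment.
jf config add artifactory --url "$ARTIFACTORY_URL" \
  --access-token "$ARTIFACTORY_TOKEN" --interactive=false

# Configure pip for a virtual PyPI repository in Artifactory (repo key is a
# placeholder; embed credentials in the URL if anonymous read is disabled).
pip config set global.index-url \
  "${ARTIFACTORY_URL}/artifactory/api/pypi/pypi-virtual/simple"

# Point the Hugging Face SDK at a Hugging Face repository in Artifactory
# (repository key is a placeholder).
echo "export HF_ENDPOINT=${ARTIFACTORY_URL}/artifactory/api/huggingfaceml/hf-remote" \
  >> /etc/profile.d/artifactory.sh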

To see an example of a lifecycle configuration script and a step-by-step example of how to use a lifecycle configuration script with SageMaker Classic Notebooks and Notebook Instances, check out this Knowledge Base Article.

Final thoughts and conclusion

Bringing the enterprise-level management of JFrog Artifactory to Amazon SageMaker is recommended in all stages of the AI/ML lifecycle. The combination of JFrog and AWS helps to alleviate the complexity and cost concerns of iterative ML model development, testing, and deployment.

JFrog Artifactory is widely used to establish best DevOps practices for artifact storage and management. Ensuring that all development artifacts are sourced through JFrog Artifactory allows for more than just basic accounting and software inventory management — this practice will ensure an organization is set up for success when it comes to regulatory compliance, licensing, and safety concerns when using third-party software packages.

The initial setup and configuration of SageMaker and Artifactory, and the customization of the environment can be tricky if you’re unfamiliar with the administrative aspects of AWS VPC networking details, SageMaker domains, AWS IAM roles and permissions, or creating and building Docker images. For detailed documentation on how to incorporate Artifactory in your SageMaker environments, check out the JFrog Knowledge Base for a working demo along with detailed step-by-step guides and explanations.

Plus, join me and the team at AWS to learn more about this integration along with new ML capabilities from JFrog in our webinar, now available on-demand: Building for the future: DevSecOps in the era of AI/ML model development.

Expanding OCI support in JFrog Artifactory with dedicated OCI repos https://jfrog.com/blog/oci-support-in-jfrog-artifactory/ Tue, 05 Dec 2023 18:57:54 +0000

Great news for developers who leverage containers — JFrog has expanded its support for the OCI Container standard with dedicated OCI repositories!

Before we touch on that, let’s do a quick recap on OCI containers for those unfamiliar with them outside the context of Docker.

What are OCI containers?

Containers are a lightweight and portable way to package and run software applications, along with their dependencies and configuration settings, in a consistent and isolated environment. The Open Container Initiative was founded in 2015 by Docker and other industry leaders to help provide industry standards around container image formats and runtimes. As the popularity of containers was exploding, it became clear that some industry standards were required in order to support interoperability between various container-related components and solutions.

OCI currently contains three specifications: the Runtime Specification (runtime-spec), the Image Specification (image-spec) and the Distribution Specification (distribution-spec). OCI is popular due to its vendor-neutral status and open governance structure. So while Docker might be the first tool that comes to mind when it comes to managing containers, the OCI specifications open the field to many other tools widely used by organizations big and small for building and running OCI compatible containers.

JFrog’s expanded OCI support: OCI Images and OCI Artifacts

JFrog has long been a premier solution for managing container images and already supports OCI images. Using our Docker repositories, developers have been able to upload and manage both Docker and OCI images.

As of version 7.74, Artifactory supports OCI v1.0 with its own dedicated repository type.

With the introduction of OCI repositories, we’ve made it easier to find and store OCI-compliant artifacts than with the less intuitive approach of using the Docker repo type. We’re also able to better accommodate the differences in the mediaType fields specific to each image format, support our customers’ workflows, and give OCI images and OCI artifacts the first-class treatment Artifactory users love.

Organizations benefit from support for multiple tools that implement the OCI specification, including Podman, ORAS, BuildX, and BuildKit/buildctl, and can even make use of projects like WASM to OCI to store WebAssembly modules as OCI Artifacts. Organizations can also upload any artifact that conforms to the OCI specification to Artifactory – anything that can be packaged as a tar or tar.gz.
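As a small illustration of pushing an arbitrary artifact with one of those tools, here is a sketch using ORAS; the host, repository key, and artifact type are placeholders:

# Push an arbitrary file as an OCI artifact to an OCI repository in Artifactory.
# Host, repository key, and artifact type are placeholders.
oras login example.jfrog.io
oras push example.jfrog.io/oci-local/my-config:1.0.0 \
  --artifact-type application/vnd.example.config.v1+json \
  config.json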

At this point in the blog, some OCI fans are probably wondering, “What about OCI 1.1???”. OCI Version 1.1.0 is still in the Release Candidate stage and not fully approved (see OCI 1.1.0-rc5).

For more information on JFrog Artifactory’s OCI support, check out the documentation here.

Helm OCI support in JFrog Artifactory https://jfrog.com/blog/helm-oci-support-in-jfrog-artifactory/ Wed, 06 Dec 2023 10:53:12 +0000

If you’re building apps to run on Kubernetes, chances are you’re using Helm. If you fall into that category, we have good news: Helm users can now benefit from JFrog Artifactory’s support of Helm OCI registries.

JFrog’s expanded Helm OCI support

Since the release of Helm v3.8.0, the Helm client has been wrapping Charts with OCI by default. This is primarily to help address performance issues with large Helm repositories, which used to require users to download a large index.yaml of the whole repository just to get one Chart. By wrapping your Helm Chart in OCI, Helm repositories can now use OCI APIs instead, including the OCI index/tagging structure, which is much easier and more efficient.

Artifactory, which has supported Helm with dedicated repositories for many years, will now allow you to choose which flavor of Helm repo you want to create: good old Helm (which we refer to as “legacy Helm”) or Helm OCI. You’ll also get set-me-up instructions tailored to both options.
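For reference, working with a Helm OCI repository uses the standard Helm OCI commands; the host name and repository key below are placeholders:

# Log in to the Helm OCI registry in Artifactory, then push and install a chart.
# Host name and repository key are placeholders.
helm registry login example.jfrog.io
helm push mychart-0.1.0.tgz oci://example.jfrog.io/helm-oci-local
helm install my-release oci://example.jfrog.io/helm-oci-local/mychart --version 0.1.0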

Regardless of which Helm repository type you leverage, Artifactory’s unified package search experience will show both Helm and Helm OCI results in queries, while the artifact search will allow you to segment your search to each repository type (Helm & Helm OCI).

In addition, account admins can control the UX flow. Artifactory will default to offering Helm OCI when creating new repositories, but it can be switched to show the legacy Helm option first.

Alternatives to storing charts in Helm repositories

At this point you might be asking yourself, “Couldn’t I just upload Helm OCI to my existing OCI repositories?” The answer is: yes, you could, but you’d miss out on the Helm client set-me-up instructions and dedicated icons that help you visually identify the contents of the repository. The artifacts will also be identified in Artifactory as “OCI” and not “Helm OCI” in the artifact tree view, which is less clearly organized and searchable.

Along those lines, you could technically upload any OCI artifact into the Helm OCI repo. The issue with that scenario is that it would be under the Helm OCI repository, making the context less clear for other users searching for artifacts. For that reason, we recommend against doing this.

Learn more

For more on Helm OCI support in Artifactory, be sure to check out the Helm OCI documentation. Not an Artifactory user, but want to try out Helm OCI repositories? Sign up for a free trial.

Adopt a “Release-first” Approach with Release Lifecycle Management in JFrog Artifactory https://jfrog.com/blog/release-lifecycle-management-in-artifactory/ Wed, 13 Sep 2023 10:00:27 +0000

UPDATE: As of swampUP 2023, Release Lifecycle Management includes the ability to create Xray policies to block promotion and/or distribution of Release Bundles that contain malicious packages, CVEs, etc. Read on for more information about Release Lifecycle Management.

Every organization has a process for building and releasing software. Smaller organizations may run a few automated tests before releasing, while larger organizations may have 100s of scans, validations, and approvals spanning everything from technical to legal. Whatever the process is, the end goal is the same: software that’s mature enough for release.

The challenge is that this process is complicated, messy, and often created in an ad hoc way, changing as organizations evolve. Different tools are used across different teams and stages of the SDLC. There may be manual steps and custom integrations between tools, and multiple packages and components to wrangle together to perform the release.

In a world where threats exist outside and within organizations, and software supply chain attacks have increased by 40% from 2021 to 2022, this status quo won’t suffice. Teams need a single source of truth that gives them full visibility into the release process, controls that ensure the integrity of the software being released, and automation wherever possible to lessen the inherent risk of human error.

Building Trust and Automating Software Releases

At JFrog, we’re establishing a “release-first” approach to software supply chain management. By focusing on the outcome of software development (i.e. the software release), we can build backwards by first asking the question, “What does it take to release verifiably secure and trusted software?”

Here’s our answer: there are six core components necessary to serve as the single source of truth for secure, trusted software released for consumption:

  1. Defining the release with all of the included packages and components of varying technologies as an immutable entity as early as possible
  2. Configuring “environments” that match an organization’s release lifecycle stages and containing the necessary repositories for the components contained within a release
  3. Capturing evidence in a single place of actions taken to ensure security and quality by the various teams and tools leveraged across the SSC
  4. Promoting a release seamlessly from one environment to another, without rebuilding at any point in the release process
  5. Applying policies to control how or when a release advances, including security checks, license validations, and operational requirements
  6. Distributing a release where needed for consumption, and ensuring a trusted chain of custody to the very “last mile” of software delivery

Announcing: Release Lifecycle Management in JFrog Artifactory

JFrog has long supported organizations to better manage how software matures through the software supply chain. First came JFrog Artifactory, next came Build Info, and later we introduced the concept of a Release Bundle. Together, these tools have been adopted by some of the largest, most sophisticated software organizations in the world.

Release Lifecycle Management in JFrog Artifactory is a big step forward in our release-first approach. With Release Lifecycle Management comes new and enhanced Artifactory capabilities that provide the building blocks to standardize release management and better secure your software supply chain.

Release Lifecycle Management builds on this existing toolset and takes it one abstraction layer higher, making it easier for organizations to adopt these best practices while laying the groundwork for even more powerful capabilities to come.

What Release Lifecycle Management means for JFrog users

One of the biggest changes associated with this release is the introduction of our updated Release Bundle, which we’re calling Release Bundle v2 (RBv2). RBv2 brings the Release Bundle concept into Artifactory and allows users to define the potential release as a single immutable entity composed of multiple build outputs, packages, and files early in the SDLC. That entity can then be treated as a single unit as it advances towards distribution, production, or consumption.


Today, Release Lifecycle Management enables organizations to:

  • Configure custom environments to align with their release lifecycle stages and assign the necessary repositories for the components contained within a release
  • Create a signed, immutable Release Bundle (RBv2) early in the SDLC, which defines the potential release candidate with all of the included packages and components of varying technologies
  • Promote a Release Bundle to a target environment (SDLC Stage) without the need for custom scripts
  • Create and apply Xray policies to block promotion and/or distribution of release bundles
  • Distribute a Release Bundle to a Distribution Edge for optimized consumption
  • Capture metadata for traceability and reporting of actions taken against the Release Bundle

By adopting these new Release Lifecycle Management capabilities, DevOps teams benefit from:

  • Enhanced traceability, including the status of every release (i.e. who promoted, what stage it’s in, when it was promoted)
  • Easier automation due to higher level abstractions (i.e. you tell Artifactory where you want to promote the Release Bundle and it takes care of the rest)
  • Stronger security and reduced chance of mistakes because everything is encapsulated in an immutable, signed Release Bundle (i.e. there are no leftovers or partial releases, so you can’t break the release as it matures)
  • A single system of record for all software your teams create

Best of all, portions of these new capabilities are available across JFrog subscription tiers, so even Pro accounts can take advantage of them. And like all JFrog capabilities, actions such as creating, promoting, and distributing a Release Bundle can be triggered from within our UI, CLI, or via CI tools thanks to our CI plugins.
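As a hedged illustration of the CLI path (the bundle name, version, target environment, signing key, and exact flags below are placeholders; check the JFrog CLI documentation for your version for the authoritative syntax):

# Create a signed Release Bundle v2, then promote it to a target environment.
# Names, versions, environment, signing key, and flags are illustrative.
jf release-bundle-create my-app 1.0.0 --builds=builds-spec.json --signing-key=my-gpg-key
jf release-bundle-promote my-app 1.0.0 PROD --signing-key=my-gpg-key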

What’s next for Release Lifecycle Management

As exciting as these new capabilities are, we’re even more excited about what this functionality enables us to build.

New dashboards to easily visualize the status of all your releases, including bottlenecks, are already underway. We’re also starting to look at collecting and storing third-party evidence of actions taken against software as it matures.

But we can’t create this world in a vacuum. We need your guidance to understand how you’re using these features, and what valuable use cases they power. So, try out these new capabilities and give us your feedback. Together, we’re building the ultimate platform for software supply chain management.

A Guide to Installing the JFrog Platform on Amazon EKS https://jfrog.com/blog/install-artifactory-on-eks/ Wed, 26 Jul 2023 19:55:17 +0000

Amazon Elastic Kubernetes Service — or Amazon EKS — is a managed service that enables you to run Kubernetes on AWS without having to manage your own Kubernetes clusters. The JFrog Platform is available on AWS, making it super simple to deploy applications reliably and predictably, scale them quickly, roll out new features easily, and make the most of hardware resources.

Did You Know?

Artifactory can serve as your Kubernetes Docker Registry.

It can provision your k8s cluster with the charts and images needed to orchestrate your application. Learn more on how you can easily configure Artifactory as your Kubernetes registry for EKS.

 

This blog post outlines the prerequisites and steps required to install and configure the JFrog Platform in Amazon EKS, including setting up two AWS systems: IAM Roles for Service Accounts (IRSA) and Application Load Balancer (ALB). By following the sequence laid out here, you can standardize Artifactory installation and ensure compliance with your internal rules. And since the configuration is based on official JFrog Helm Charts, you’re always supported in case you run into any issues.

Workflow to use Artifactory in Amazon EKS

Overview

Prerequisites

Before you’re ready to configure ALB and IRSA in AWS, you’ll need the following requirements:

Docker subdomain access

The Docker subdomain method requires URL rewriting, which isn’t supported by the ALB, so you’ll have to set up a reverse proxy such as Nginx. In order to perform as intended, the ALB will have to redirect the HTTP/S requests to the Nginx Server. This set-up process isn’t covered in this blog post. Find out more about accessing a Docker registry hosted in Artifactory here >

 

Once these prerequisites are in place, you’re ready to get started!

Step 1: Set up the IRSA

To enable IRSA, you’ll first need to configure an OpenID Connect (OIDC) provider in your Amazon EKS cluster. Then, configure it to “trust” the Artifactory service account (which you’ll create later).
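If your cluster doesn’t yet have an IAM OIDC provider associated with it, a common way to do this is with eksctl; the cluster name and region are placeholders:

# Associate an IAM OIDC provider with the EKS cluster (names are placeholders).
eksctl utils associate-iam-oidc-provider \
  --cluster <CLUSTER_NAME> --region <AWS_REGION> --approve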

Note: make sure to decide upon the name of the K8S namespace that Artifactory will be deployed to.

Here is an example trust policy for the IAM role that the Artifactory service account will assume, which you can apply via the AWS CLI – you’ll need to update the parameters per your specifications:

{
  "Version": "2012-10-17",
  "Statement": [
      {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
               "Federated": "arn:aws:iam:::oidc-provider/oidc.eks.<AWS_REGION>.amazonaws.com/id/"
          },
          "Action": "sts:AssumeRoleWithWebIdentity",
          "Condition": {
              "StringEquals": {
                  "oidc.eks..amazonaws.com/id/:aud": "sts.amazonaws.com",
                   "oidc.eks..amazonaws.com/id/:sub": "system:serviceaccount:<K8S_NAMESPACE>:artifactory"
              }
          }
      }
  ]
}

Step 2: Set up the ALB

For the ALB, you’ll use the AWS Load Balancer Helm Chart to create an Ingress Controller in the EKS cluster. An Ingress Controller is a load balancer that’s specialized for Kubernetes and other containerized environments. You can find the AWS Load Balancer instructions here.
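For reference, here is a minimal sketch of installing the controller with Helm; the cluster name is a placeholder, and the IAM permissions the controller itself needs are covered in the AWS instructions linked above:

# Install the AWS Load Balancer Controller (cluster name is a placeholder).
helm repo add eks https://aws.github.io/eks-charts
helm repo update
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  -n kube-system \
  --set clusterName=<CLUSTER_NAME>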

Step 3: Configure the JFrog Platform Helm Chart

Once the Load Balancer Controller is deployed, you can configure the JFrog Platform Helm Chart to spin up the ALB via the ingress section.

Use this values.yaml file to deploy the Helm Chart:

global:
...
artifactory:
 ...
 artifactory:
 ...
 ingress:
   enabled: true
   routerPath: /
   artifactoryPath: /artifactory/
   className: alb
   annotations:
     alb.ingress.kubernetes.io/scheme: internet-facing
     alb.ingress.kubernetes.io/target-type: ip
     alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
     alb.ingress.kubernetes.io/ssl-redirect: '443'
     alb.ingress.kubernetes.io/ssl-policy: ELBSecurityPolicy-2016-08
     # name of the future ALB
     alb.ingress.kubernetes.io/load-balancer-name: yann-demo-lab-eks
      # consume the TLS certificate from AWS Certificate Manager (ACM)
      alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:<AWS_REGION>:<AWS_ACCOUNT_ID>:certificate/<CERT_ID>

Step 4: Set up the storage

Artifactory requires a license to store artifacts in an Amazon S3 bucket.

First, store your license(s) in a text file. In the case of multiple licenses (i.e. when configuring an Artifactory HA cluster), make sure each license is separated by a new line.

Configure the chart to consume the secret.

License section in the values.yaml file:

global:
...
artifactory:
 ...
 artifactory:
   license:
     secret: artifactory-licenses
     dataKey: artifactory.cluster.lic

Next, you’ll configure the Helm Chart to create the Artifactory K8S service account and attach a role with an annotation.

The storage configuration will use the s3-storage-v3-direct filestore template, which is recommended on AWS. Make sure to set useInstanceCredentials to “true”, which means you don’t have to specify the credentials in the binarystore.xml. See the authentication mechanism documentation for details.

Storage section in the values.yaml file to configure the Helm Chart:

global:
...
artifactory:
 ...
 serviceAccount:
   create: true
   annotations:
     eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<ROLE_NAME>
   automountServiceAccountToken: true
 artifactory:
   license:
     secret: artifactory-licenses
     dataKey: artifactory.cluster.lic
   persistence:
...
     type: s3-storage-v3-direct
     awsS3V3:
        bucketName: <BUCKET_NAME>
        endpoint: s3.<REGION>.amazonaws.com
        region: <REGION>
        path: artifactory/filestore
       maxConnections: 50
       # in order to not specify identity and credential fields
       useInstanceCredentials: true
       testConnection: true

Step 5: Deploy the JFrog Platform Helm Chart

There are three sub-steps for deploying the JFrog Platform Helm Chart.

Sub-step 1: Create a namespace for where to deploy the JFrog Platform Helm Chart. Make sure to set the same name as the one configured in the OpenID Connect (OIDC) provider.

kubectl create ns <K8S_NAMESPACE>

Sub-step 2: Deploy the license as a secret.

kubectl create secret generic artifactory-licenses --from-file=artifactory.cluster.lic=license.txt -n <K8S_NAMESPACE>

Sub-step 3: Deploy the Helm Chart.

helm upgrade --install artifactory jfrog/jfrog-platform -f values.yaml -n <K8S_NAMESPACE>

This final sub-step will deploy both Artifactory and the ALB with an auto-generated DNS (A record). If you’ve attached a TLS certificate, you’ll also want to add a new CNAME record in Route 53 that points to the auto-generated DNS. This will link your Load Balancer to your DNS, enabling you to access Artifactory.

Linking Load Balancer to your DNS

 

What is Platform Engineering? https://jfrog.com/blog/what-is-platform-engineering/ Tue, 02 May 2023 14:21:25 +0000

If DevOps is an approach to software development that emphasizes collaboration between Development and Operations teams, then Platform Engineering operationalizes that approach by creating a centralized platform that has specific sets of tools and processes. It’s the discipline of designing and building toolchains and workflows that enable self-service capabilities for software engineering organizations in a cloud-native era. It focuses on the interconnectivity between components and how they affect the overall development process.

In my recent webinar “Platform Engineering: Just a fad -or- the remedy to modern development headaches?” I discuss the core tenets of platform engineering including where and how it makes sense to adopt this approach in your organization. Watch the webinar below.

But in case you missed it, here are some key takeaways, including how platform engineering can enhance your software development lifecycle (SDLC).

What are the key components of platform engineering?

The key components of Platform Engineering include designing a self-service workflow and creating a streamlined, automated platform that caters to software developers and development teams. This is the fundamental idea of using an Internal Developer Platform (IDP) to enable self-service. As a result, platform engineering can help alleviate the cognitive load faced by developers and enable organizations to operate more efficiently.

Platform engineering emphasizes designing the actual build environment and build system to be self-service, while focusing on delivering a platform for the software developers and the development teams. But for platform engineering to work, collaboration across teams is vital. You want to treat your internal customers like customers and constantly adjust the platform to meet the needs of the organization. By focusing on the entire software development lifecycle (SDLC) and incorporating team feedback, platform engineering can lead to more effective and efficient software development processes. This isn’t a one-time effort but a constant evaluation and re-evaluation to make sure the platform is always evolving and meeting the team’s needs.

What are the best strategies for addressing distributed teams and hybrid environments in platform engineering?

Platform engineering can work with distributed teams by using replication and distribution techniques, which can help maintain consistency across organizations, regardless of their location. The key is to have a solid base foundation for platform engineering and to consolidate that information for easy access and collaboration between teams. By treating platform engineering as an evolving process, organizations can better adapt to the changing needs of their teams and maintain a more efficient and streamlined approach to software development.

What are the best practices for platform engineering adoption?

Here are a few best practices for adopting platform engineering at your organization:

  • Take into account the feedback of developers. Platform engineering and development should collaborate together, and platform engineers should seek input from developers to make sure the appropriate DevOps tools are acquired.
  • Don’t attempt to manage all aspects of the system; rather, find a balance and generate trust.
  • Foster a DevOps-friendly culture, one where work is enjoyable and communication is open.
  • View the platform as a product which must be marketed to developers, and dedicate resources to streamlining the experience associated with the platform.
  • Incorporate security tools into platform engineering, and evaluate your toolsets to make sure they adhere to security best practices. This includes assessing potential vulnerabilities, monitoring compliance, and automating security processes where possible.

By focusing on these best practices, platform engineering can create an environment where teams move quickly and effectively together, leading to a successful DevOps culture.

What are the platform engineering capabilities of JFrog?

The JFrog Software Supply Chain Platform is excellent for platform engineering because it offers a comprehensive set of tools and features designed to help platform engineers build, manage, and deploy applications quickly and efficiently. The platform engineering capabilities of JFrog include JFrog Artifactory, JFrog Xray, JFrog Distribution, JFrog Connect, and JFrog Pipelines:

  • Artifactory is a universal artifact manager that helps manage 85 to 90% of the software needed to complete a project. It also provides extra layers of security, accountability and auditability, as well as metadata to automate tasks.
  • Xray provides supply chain security with contextual analysis to determine which vulnerabilities are applicable and which are not.
  • Distribution helps take secure software components where needed for consumption whether internally or externally.
  • Connect, which is an IoT solution, allows users to deploy web services, updates to vehicles, and perform remote diagnostics.
  • Pipelines is a CI/CD orchestration tool that can be used on its own, or as an endpoint for other tools.
  • Finally, the JFrog Software Supply Chain platform is API-able, scalable, replication-friendly, and provides the ability to trace software throughout its lifecycle.

Overall, JFrog’s platform engineering capabilities bring an end-to-end SDLC solution to developers, providing the security, automation, and traceability needed to build and deploy software efficiently. It also provides the ability to address issues quickly, and to do so at a global scale.
