Ping Identity - The Challenge of 'Run'

Most IT professionals are familiar with the typical cycle of IT projects, which usually run through ‘design’ and ‘build’ phases before moving into production in ‘run’.

At ProofID, our experience is that many organizations focus a lot of time and energy on the first two stages, ensuring that the system is architected appropriately and built to address all functional and non-functional requirements. Often, however, the challenges of the ‘run’ stage are not considered up front, leading to significant operational difficulties and unplanned costs as the service enters production. We have found this to be equally true for Ping Identity projects. In this series of blogs, we will explore the challenges faced by organizations running Ping Identity deployments in production, and outline some possible solutions to ensure organizations get the best out of their investment in the design and build stages of their deployment.

Ping Identity makes fantastic Identity and Access Management products, and one of their best characteristics is their stability in run: Ping servers failing or crashing is not a common problem. However, even with a stable platform, there are significant challenges involved in maintaining a complex enterprise platform such as PingFederate or PingAccess. Furthermore, the nature of these products means that any outage will lead to significant disruption, as users and customers will not be able to authenticate to key applications, and in some industries such outages can come with a significant price tag. So, what are the issues which can lead to problems in run?

Hard to find, niche skills

Identity and Access Management, whilst being a pervasive technology, requires a deep level of niche skills in order to manage and troubleshoot a platform effectively. Not only are vendor-specific skills required, but engineers also need a thorough understanding of relevant standards and protocols such as SAML, OpenID Connect and SCIM. Because IAM is an integration technology, faults that surface in the wider IAM ecosystem are often not caused by the IAM system itself; however, to isolate the problem, the issue must be traced from its source, which requires the ability to understand at the transaction level how authentication is processed. Such skills can be expensive and hard to find, and aren’t always factored into the total cost of ownership for an IAM platform.

Staging of configuration on the ‘route to live’

Most modern organizations maintain multiple replicated environments on the ‘route to live’. Typically, this will include one or more development environments, with configurations then being staged into pre-production and production environments. When a new connection is added into a Ping environment, for example to provide SSO to a new application, the configuration will typically be created and tested in the development environment, before being replicated and tested in pre-production and finally production.

The nature of the underlying protocols means that such configuration changes require many configuration steps; sometimes as many as fifty individual changes may be required to integrate a new application. If carried out manually, the configuration must be painstakingly documented prior to being staged through the environments, and ProofID has seen such documents stretch to over 70 pages for a single change. Not only is this highly inefficient, but there is a high chance of human error being introduced, either in producing the documentation or in executing it. Such errors will at the least lead to delays for troubleshooting, and at their worst could lead to an outage due to misconfiguration. Where outages interrupt business, that level of risk is not acceptable.
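One way to reduce the risk of manual staging errors is to compare environments mechanically rather than by eye. The following is a minimal Python sketch, assuming each environment's connection configuration can be exported as JSON-like nested data. The field names and the `flatten` and `diff_configs` helpers are hypothetical illustrations; they are not part of any Ping Identity product or of ProofID ConfigMigrator.

```python
from typing import Any, Dict, Iterator, Tuple

# Marker used when a setting exists in one environment but not the other.
MISSING = "<absent>"


def flatten(config: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs for every leaf value in a nested config."""
    if isinstance(config, dict):
        for key in sorted(config):
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten(config[key], path)
    elif isinstance(config, list):
        for index, item in enumerate(config):
            yield from flatten(item, f"{prefix}[{index}]")
    else:
        yield prefix, config


def diff_configs(source: Any, target: Any) -> Dict[str, Tuple[Any, Any]]:
    """Return {path: (source_value, target_value)} for every differing leaf."""
    src = dict(flatten(source))
    tgt = dict(flatten(target))
    differences = {}
    for path in sorted(src.keys() | tgt.keys()):
        left = src.get(path, MISSING)
        right = tgt.get(path, MISSING)
        if left != right:
            differences[path] = (left, right)
    return differences
```

Calling `diff_configs(dev_export, preprod_export)` on two exported configurations returns only the settings that differ. Some differences, such as environment-specific URLs, are expected; anything else is a candidate for a staging error.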

Platform insights

Understanding how the IAM platform is performing is a key element in ensuring good service in run. Monitoring usage patterns and associated performance is essential to ensure fast response times and an optimal user experience. Additionally, identifying underlying error conditions which may not be immediately obvious can be priceless in terms of avoiding future issues and outages.

Out of the box, Ping Identity products provide a tremendous amount of performance and troubleshooting data. However, this data is spread across multiple log files which can be difficult to read, particularly in resilient and distributed clustered environments. Aggregating log files to a central database can help, but there is still a requirement to analyze and interpret the data, which can be challenging.
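As an illustration of the aggregation step, the sketch below merges log lines from several cluster nodes into a single time-ordered stream. It assumes each line starts with a timestamp of the form `2024-01-01 12:00:00,123` and that each node's log is already in time order; the actual layout depends on how logging is configured in a given deployment, and the function names here are hypothetical.

```python
import heapq
from datetime import datetime
from typing import Dict, Iterable, Iterator, Tuple

# Assumed timestamp layout at the start of each log line (23 characters).
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
TIMESTAMP_LENGTH = 23

LogEntry = Tuple[datetime, str, str]  # (timestamp, node name, rest of line)


def parse_lines(node: str, lines: Iterable[str]) -> Iterator[LogEntry]:
    """Parse one node's log, skipping lines without a leading timestamp."""
    for line in lines:
        line = line.rstrip("\n")
        try:
            stamp = datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
        except ValueError:
            continue  # e.g. stack-trace continuation lines
        yield stamp, node, line[TIMESTAMP_LENGTH + 1:]


def merge_node_logs(logs: Dict[str, Iterable[str]]) -> Iterator[LogEntry]:
    """Lazily merge per-node logs into one stream ordered by timestamp."""
    streams = [parse_lines(node, lines) for node, lines in logs.items()]
    return heapq.merge(*streams)
```

Because `heapq.merge` is lazy, the same approach works when each value in `logs` is an open file handle rather than an in-memory list.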

In many deployments, these difficulties mean that analysis of logs is something which is done ‘after the fact’ to understand the causes of an issue, rather than something which is done proactively to prevent occurrence of issues. To optimize the run experience, proactive analysis and monitoring of performance should be a core activity.
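A simple form of proactive analysis is to watch the rate of errors rather than waiting for an incident. The sketch below counts `ERROR` events per minute and reports the minutes that reach a threshold. It is a generic illustration: the `(timestamp, level)` input shape, the `error_spikes` name and the threshold value are assumptions, not features of any Ping Identity product.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple


def error_spikes(
    events: Iterable[Tuple[datetime, str]], threshold: int
) -> List[Tuple[datetime, int]]:
    """Return (minute, error_count) for each minute with at least `threshold` errors."""
    per_minute = Counter(
        stamp.replace(second=0, microsecond=0)  # truncate to the start of the minute
        for stamp, level in events
        if level == "ERROR"
    )
    return sorted(
        (minute, count) for minute, count in per_minute.items() if count >= threshold
    )
```

Run on a schedule against recent log data, a check like this can raise an alert while the error rate is still rising, before users begin reporting failures.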

Always on support

In modern, global enterprises, the IAM system becomes a central part of the organization’s fabric, processing authentication to all organizational applications and assets. Between workforce, customers and external users, authentication never stops, meaning that ‘always on’ support is required around the clock.

Even with platforms as stable as Ping Identity, issues and outages will occur, and as an integration platform, the IAM system will often surface issues first, even if the underlying cause lies elsewhere. For example, if Active Directory is unavailable, this may first become visible as users being unable to SSO into applications, and may turn into an incident in the middle of the night reported as ‘SSO is down’.
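When an ‘SSO is down’ report arrives, a quick first step is to check whether the systems the IAM platform depends on are reachable at all. The sketch below performs a basic TCP connection test against a set of dependencies. The host names and ports in the usage note are placeholders, and a successful connection only shows that the port is open, not that the directory or service behind it is healthy.

```python
import socket
from typing import Dict, Tuple


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_dependencies(dependencies: Dict[str, Tuple[str, int]]) -> Dict[str, bool]:
    """Map each named dependency to whether its (host, port) accepts connections."""
    return {
        name: is_reachable(host, port)
        for name, (host, port) in dependencies.items()
    }
```

For example, `check_dependencies({"directory": ("ad.example.com", 636)})` would show quickly whether the directory's LDAPS port is accepting connections, helping to separate an upstream outage from a fault in the IAM platform itself.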

Having access to technical support whenever it is needed, with real understanding of the local deployment rather than just the technology, is a key requirement for enterprise IAM systems in run. However, this requirement is sometimes overlooked until the support is actually needed.

Summary

In this blog, we have identified some of the common challenges facing enterprises as they manage Ping Identity technologies in ‘run’. From sourcing suitably skilled engineers to managing staging of configuration, there are many ways in which the quality of the service being offered can be compromised.

In the second blog in this series, we will focus on the ‘Route to Live’ with ProofID ConfigMigrator.
