Exploring social media for OSINT (SOCMINT)
SOCMINT, the more nuanced sibling of OSINT, has particular expertise in unearthing information from the bustling world of social media platforms. While OSINT is quite content to peruse publicly available data, SOCMINT isn’t afraid to venture a bit further, potentially accessing information that was initially shared within more closed circles on social media platforms.
Social media platforms have transformed into bustling hubs of information where individuals and organizations share updates, opinions, and news. These platforms have become essential tools in the OSINT toolkit, offering a ton of data that can be analyzed for many purposes. Let’s delve into some popular social media platforms and how they can be utilized as OSINT goldmines:
- X (formerly Twitter) (twitter.com):
- The buzz station: Twitter is pretty much the hotspot for all the latest gossip and breaking news. It’s where OSINT analysts hang out to grab the newest scoops on just about everything currently happening.
- Search, but with style: Twitter’s got this cool search feature that lets you sift through tweets with all kinds of filters, making it a piece of cake for OSINT analysts to nail down the exact information they’re after.
- API coolness: Oh, and Twitter’s API integration is a real time saver, helping researchers to gather a whole bunch of data without breaking a sweat.
- Facebook (facebook.com):
- Chat rooms galore: Facebook is this giant hangout spot with a ton of communities and forums where people share information on a wide array of topics. It’s a goldmine for OSINT analysts looking to dig up some fresh data.
- Marketplace hustle: The Facebook Marketplace is buzzing with activity, with folks buying and selling stuff left and right. It’s a great place for analysts to catch up on the latest market trends and consumer behaviors.
- Event buzz: Then there’s Facebook Events, a handy tool for creating and promoting events. It’s a neat feature for researchers to gather some intel on different events and the crowd that’s attending.
- LinkedIn (linkedin.com):
- Networking hub: LinkedIn is basically the hub for professional connections, offering a treasure chest of information on companies, industries, and the movers and shakers in the business world, making it a hotspot for business research.
- Job board: It’s also the place to check out job postings and get a peek into what’s happening in the job market and industry trends.
Note
Job boards are my happy place, as I can gather so much intel on a target just from their job listings.
- Content central: LinkedIn users are always sharing a bunch of stuff, from articles to presentations, making it a rich source of information for OSINT researchers to dive into.
- Instagram (instagram.com):
- Visual wonderland: Instagram is all about visuals, offering a sea of images and videos that are perfect for different kinds of analyses, from spotting trends to analyzing sentiments.
- Tagging magic: Instagram has cool features such as hashtags and geotags that help users categorize and find content, making it a handy tool for OSINT researchers to track down specific information.
- Influencer watch: Instagram is also home to a bunch of influencers who have a big say in shaping opinions. Keeping an eye on their content and strategies can give you a peek into social media trends and what the public is thinking.
OK, now that we know the intel type we’re dealing with, let’s see what we can do with it.
SOCMINT concepts – a deep dive
It’s essential to grasp the various concepts and terminologies that form the backbone of SOCMINT. Let’s dig in here:
- Details about a user’s profile: This static data provides a brief overview of the identity a user presents online, either personally or professionally. A user’s LinkedIn page might include information about their current job as a “Windows Admin,” as well as their educational background and skill sets such as “Exchange 2013” or “SharePoint.” A user’s hobbies and the kind of information they typically engage with may be displayed on their Twitter profile, providing insight into their personality and preferences.
- Interactions: This aspect covers the dynamic activities users engage in on social media platforms. For example, on Facebook, a user might be actively participating in a group discussion about the latest developments in cybersecurity, sharing insights or resources. On platforms such as Instagram, interactions could involve commenting on a post about wireless security tools or sharing a story that highlights a recent webinar on ethical hacking. These interactions are a goldmine of data, offering a real-time view of users’ opinions, discussions, and the content they resonate with.
- Metadata: This refers to the contextual data that accompanies the primary content shared on social media platforms. For instance, a photo that has been posted might have all kinds of metadata goodies, such as the geographical location, the time and date of the post, and the type of device used for the upload. This metadata can offer deeper insights into users’ behaviors and patterns, helping researchers to build a more comprehensive profile of individuals or groups.
Note
The information that folks don’t understand they’re exposing on social media sites is staggering. Some people expose everything from their name and gender to their email addresses and kids’ names. But the really scary thing for me is when people post photos of their kids online. Why? Well, rule number 1 is that once it’s on the internet, you cannot take it back. Rule number 2 is that parents have a tendency to list their children’s names. Photos also often contain details you didn’t recognize as something that might put your kids in danger, such as a first-day-of-school photo taken in front of the house with the house address visible. OK, I’m getting off my security soapbox.
There are two types of data in this world
Technically, there are two different types of data that you’ll find:
- Explicit information: This is the kind of data that users willingly share on their social media platforms. It’s the straightforward, no-beating-around-the-bush kind of information that forms the visible layer of a user’s online persona. For instance, a user might openly share their views on the latest developments in the cybersecurity world in a Tweet, or post a LinkedIn update about a recent certification they’ve achieved in ethical hacking. This category also includes profile details such as job titles, educational backgrounds, and the groups or communities they are part of. Explicit information serves as a direct window into a user’s professional life, interests, and opinions, offering a clear-cut view of their online activities and engagements.
- Implicit information: This is the more subtle, often unintentional, data that users reveal online. Let me give you an example: by analyzing the patterns of a user’s likes and shares on platforms such as Facebook, researchers might be able to deduce their preferences, affiliations, or inclinations toward certain topics or communities. Similarly, the metadata attached to the content they share, such as location tags or device types, can offer clues about their habits, locations they frequent, or the kind of devices they use. Implicit information, therefore, serves as a rich reservoir of insights that can help researchers build a more nuanced and comprehensive profile of individuals, often revealing patterns and details that are not immediately apparent.
By harnessing both explicit and implicit information, SOCMINT researchers can weave together a detailed tapestry of insights, helping them to conduct more rounded and insightful investigations. It’s like putting together the pieces of a puzzle, where each piece, whether explicit or implicit, adds a new dimension to the overall picture.
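To make the idea of implicit information a little more concrete, here is a minimal sketch of inferring a user's likely interests from the topics of posts they liked or shared. The `liked_topics` list and the `infer_interests` helper are hypothetical and exist only for illustration; real engagement data would need to be collected and labeled first.

```python
from collections import Counter

# Hypothetical sample data: the topic of each post a user liked or shared.
liked_topics = [
    "cybersecurity", "batman", "cybersecurity", "wireless",
    "cybersecurity", "batman",
]


def infer_interests(topics, top_n=2):
    """Return the top_n most frequent topics as (topic, count) pairs."""
    return Counter(topics).most_common(top_n)


print(infer_interests(liked_topics))
# [('cybersecurity', 3), ('batman', 2)]
```

No single like says much on its own; it is the pattern across many of them that hints at preferences the user never stated explicitly.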
You need to “Sherlock it”
This is a phrase my kids are sick of. Whenever they ask me a question that I think they should be able to use common sense to figure out the answer to, I say “Sherlock it.” Imagine how happy I was when this tool came out!
Sherlock is this handy-dandy tool that helps you hunt down a person’s or organization’s username across a bunch of social media platforms. It’s like your personal detective that lets you pop in a name or username and then scours the internet to find any social media profiles that might be linked to that name or username.
Let’s give it a whirl. These are the instructions to install it on a Kali box, which is my go-to system for OSINT:
- Update your system: Before we invite Sherlock into our digital abode, let’s make sure everything is spick and span. Open your terminal and run the following command to update your system:
sudo apt update && sudo apt upgrade -y
- Install Git: Now, we need to ensure that Git, the popular version control system, is installed on your Kali system. It’s like the digital toolbox for our Sherlock installation. Run this command to install Git:
sudo apt install git -y
This command politely asks your system to install Git and automatically answers “yes” to the confirmation prompt (hence, -y).
- Clone Sherlock repository: Next, we’re going to clone the Sherlock repository from GitHub. It’s like Sherlock’s home where he keeps all his detective tools. Use this command to clone the repository:
git clone https://github.com/sherlock-project/sherlock.git
- Install requirements: Before we can start using Sherlock, we need to set up the environment properly. Move into the Sherlock directory and run these commands to install the requirements:
cd sherlock
python3 -m pip install -r requirements.txt
- Now that Sherlock is installed, let’s give it a whirl. Just use this simple command:
sherlock <username>
Now, keep in mind that the results might be a hit or miss. For a little sneak peek, here’s a screenshot of a search for the username dalemeredith:
Figure 3.5 – Using Sherlock against the username dalemeredith
Whether you’re on the hunt for clues about a username or just trying to connect the online dots, Sherlock’s your go-to buddy. It’s straightforward, no-nonsense, and, honestly, a bit of a game changer for anyone in the cyber-sleuthing biz.
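If you're curious about the general idea behind username enumeration, here is a rough sketch. It is not Sherlock's actual implementation: the three URL templates are illustrative, the `fetch_status` callable is a hypothetical stand-in for a real HTTP request, and treating an HTTP 200 as "profile exists" is a simplification, since many sites return 200 even for missing users.

```python
from typing import Callable, Dict

# Illustrative templates only; a real tool tracks far more sites and
# uses site-specific rules to decide whether a profile exists.
SITE_TEMPLATES: Dict[str, str] = {
    "GitHub": "https://github.com/{username}",
    "Instagram": "https://www.instagram.com/{username}/",
    "X": "https://x.com/{username}",
}


def candidate_urls(username: str) -> Dict[str, str]:
    """Build the profile URL a username would have on each site."""
    return {
        site: template.format(username=username)
        for site, template in SITE_TEMPLATES.items()
    }


def find_profiles(username: str, fetch_status: Callable[[str], int]) -> Dict[str, str]:
    """Keep only the URLs for which fetch_status reports HTTP 200.

    fetch_status is injected (hypothetical) so the logic can be tried
    offline; in practice it would wrap a real HTTP GET.
    """
    return {
        site: url
        for site, url in candidate_urls(username).items()
        if fetch_status(url) == 200
    }


# Offline usage example with a fake fetcher that "finds" only GitHub.
def fake_fetch(url: str) -> int:
    return 200 if "github.com" in url else 404


print(find_profiles("dalemeredith", fake_fetch))
```

This also explains why the results can be hit or miss: a matching username on a site does not prove it belongs to the same person.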
Hashtags and geolocations
When it comes to digging up some golden nuggets of information online, using hashtags and geolocation can be your best pals. It’s like having a secret map that leads you straight to the treasure of data you’re after.
Hashtags, marked by the # symbol, are like the breadcrumbs that lead you to the heart of buzzing conversations and trending topics. Picture this: you’re keen on keeping tabs on the latest chatter about, oh, I don’t know, how about the coolest hero ever. Let’s pick someone with no superpowers, someone who runs around at night in a cape. I know; let’s say Batman. Just pop into your favorite social media platform and search #Batman, and bam! You’re now in the epicenter of all the latest discussions and insights. It’s a fantastic way to stay in the loop and gather real-time intel. Here’s how to wield hashtags effectively:
- Trend analysis: Tools such as Hashtagify (https://hashtagify.me/) and RiteTag (https://ritetag.com/) can be your guiding stars in the hashtag universe. They help you pinpoint trending hashtags and even suggest the optimal ones to amplify your reach. It’s like having a backstage pass to the hottest topics creating waves online.
- Community engagement: Platforms such as TweetDeck (https://tweetdeck.twitter.com/) can be your command center for monitoring and engaging with communities revolving around specific hashtags. It’s your gateway to immerse yourself in niche discussions and gather firsthand insights from the epicenter of the conversation.
- Content discovery: BuzzSumo (https://buzzsumo.com/) is your trusty companion in the quest for discovering popular content based on hashtags. It’s your magnifying glass to zoom in on the content that’s garnering attention in your field of interest.
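If you have already collected a batch of posts, a quick do-it-yourself trend check is only a few lines of code. The sketch below pulls hashtags out of post text and counts them; the sample posts are made up for illustration, and the simple `#\w+` pattern is an assumption that will not match every platform's hashtag rules.

```python
import re
from collections import Counter

# A hashtag here is "#" followed by letters, digits, or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(posts):
    """Count hashtags across posts, ignoring case."""
    counts = Counter()
    for post in posts:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(post))
    return counts


# Made-up sample posts for illustration.
posts = [
    "Rewatching the classics tonight #Batman #Gotham",
    "No superpowers, just gadgets. #batman",
    "New OSINT writeup is live #OSINT",
]

print(count_hashtags(posts).most_common(1))
# [('batman', 2)]
```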
Then we have geolocations
A geolocation is the digital equivalent of saying, “X marks the spot!” You can pinpoint the geographical coordinates (latitude and longitude) of any device connected to the internet. Any device: your phone, tablet, or even your nifty watch. This fascinating technology branches into three main types: server-based, device-based, and combined data collection:
- Server-based data collection: The IP sleuth
This approach is akin to having a digital sleuth that traces the physical locations linked to IP addresses, thanks to years of data mining. However, the accuracy is as good as the data provided by third-party service providers, sometimes making the data integrity a bit of a guessing game. It’s like having a map with ever-changing landmarks where the service providers dictate the standards and offer customized geolocation solutions.
- Device-based data collection: The GPS whisperer
This method is like having a mini detective in your pocket, constantly whispering the whereabouts of your device. It primarily relies on GPS and cellular networks, offering more accuracy in densely populated areas. However, in less-populated regions, it might get a bit directionally challenged, leading to data delays or gaps. It’s essential to enable location detection on each device and app to make the most of this method but remember that it’s always a good idea to keep an eye on privacy concerns.
- Combined data collection: The best of both worlds
This method is like having a dynamic duo of detectives, combining device-based and server-based detection strengths to offer a more comprehensive insight. It ensures a better user experience by providing a fallback option if one data collection method fails, making it a reliable choice for websites aiming to enhance visitor interaction.
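The fallback behavior of the combined approach can be sketched as follows. The two lookup callables are hypothetical placeholders for a device-based source (GPS or cellular) and a server-based source (IP lookup); the sketch only shows the ordering logic, not how either lookup is performed.

```python
from typing import Callable, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Locator = Callable[[], Optional[Coordinates]]


def locate(device_lookup: Locator, server_lookup: Locator) -> Tuple[Optional[Coordinates], str]:
    """Try the device-based lookup first, then fall back to the server-based one.

    Returns the coordinates (or None) together with the name of the
    method that produced them.
    """
    position = device_lookup()
    if position is not None:
        return position, "device"
    position = server_lookup()
    if position is not None:
        return position, "server"
    return None, "unavailable"


# Usage example: location services are off, so the IP-based estimate is used.
# The coordinates are arbitrary illustrative values.
print(locate(lambda: None, lambda: (36.5, -115.25)))
```

The device-based result is preferred here because it is usually the more precise of the two when it is available.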
Geolocation tracking in the real world
Google was actively working with law enforcement organizations to support the investigations into the incidents that occurred in the United States on January 6, 2021. Geofencing warrants, which allow the authorities to ask for information on devices that were in a specified location at a particular time, make this collaboration easier. Although this technique has been successful in locating suspects, it has also raised concerns about privacy because it may also collect information on innocent people. Google is negotiating this tricky situation in an attempt to strike a balance between aiding in the investigations and protecting user privacy.
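To see why a geofence can sweep up bystanders, it helps to look at how simple the underlying test is. The sketch below is purely illustrative and says nothing about how any provider actually answers such a warrant: it assumes each location record is a hypothetical `(latitude, longitude, timestamp)` tuple and checks whether it falls inside a circle during a time window.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_geofence(record, center, radius_m, start, end):
    """True if the record lies within radius_m of center between start and end."""
    lat, lon, seen_at = record
    within_time = start <= seen_at <= end
    within_area = haversine_m(lat, lon, center[0], center[1]) <= radius_m
    return within_time and within_area


# Hypothetical fence: 100 m around an arbitrary point, for a two-hour window.
center = (38.0, -77.0)
start = datetime(2024, 1, 1, 13, 0)
end = datetime(2024, 1, 1, 15, 0)

nearby = (38.0005, -77.0, datetime(2024, 1, 1, 14, 0))   # roughly 55 m away
far_away = (39.0, -77.0, datetime(2024, 1, 1, 14, 0))    # roughly 111 km away

print(in_geofence(nearby, center, 100, start, end))
print(in_geofence(far_away, center, 100, start, end))
```

Notice that the test knows nothing about why a device was there, which is exactly where the privacy concern comes from.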
This incident highlights the crucial part that tech firms can play in advancing criminal investigations in the current digital era, as well as the delicate balance between assisting law enforcement and protecting user privacy. Keeping this balance is essential for maintaining public trust and cooperation, and it must be maintained as technology develops in order to guarantee that justice is done without jeopardizing people’s privacy. This indicates to me a future where technology and law enforcement may combine more regularly, but with the required safeguards in place to preserve individual rights; at least, that’s my hope. I’m still scared anytime I see any government getting involved with technology.
The magical world of EXIF data
The EXIF data in a photo is really the star of the show. This tiny genius stealthily infuses itself into the photos you take with your phone or digital camera, storing a wealth of data, such as the GPS coordinates that may be used to retrace your steps to the precise spot where the shot was taken. This must be a breeze, right? Here’s the catch: in order to safeguard users’ privacy, most websites and social media sites remove this information. Hence, an image with complete EXIF data is as rare as a needle in a haystack while searching the internet. You’re in for a real treat, though, if you can track down the original file. When you do, you can use tools such as Jimpl (jimpl.com) to extract the EXIF data from a photo. Here, let me upload one for you:
Figure 3.6 – Jimpl.com looking at my photo information
Now, you’ll notice that my location’s information wasn’t available; this is because I’ve turned location tracking off for my photos, including this particular one, which was taken at Blackhat, and my phone is locked down pretty tight. But look at the metadata that it did pull:
Figure 3.7 – Jimpl even detected my camera settings
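When a photo does carry location data, EXIF stores each coordinate as degrees, minutes, and seconds plus a hemisphere reference (N/S or E/W), so you usually need to convert it to decimal degrees before dropping it into a map. Here is a minimal sketch of that conversion; the `dms_to_decimal` helper and the sample values are illustrative and are not taken from the photo shown earlier.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    South and West are negative by convention.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value


# Illustrative values only.
latitude = dms_to_decimal(36, 30, 0, "N")
longitude = dms_to_decimal(115, 15, 0, "W")
print(latitude, longitude)
# 36.5 -115.25
```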
ExifTool is a popular program used by those who need to look into or edit the metadata of multimedia files. Now, ExifTool is built into Kali, so let’s fire up our Kali systems and see what it can do.
Your first step into the world of metadata manipulation is learning how to read the existing metadata in a file. You can do this by using a simple command:
exiftool yourmasterpiece.jpg
This command will reveal all the metadata secrets that yourmasterpiece.jpg holds, including details such as the camera settings used and the date it was created.
Figure 3.8 – Results from ExifTool on the photo “Dale at Blackhat 2023.jpg”
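If you want to work with this output in a script rather than read it off the screen, ExifTool can emit JSON with its `-json` option. The sketch below assumes `exiftool` is on your PATH; the helper names and the sample record are hypothetical, and the tag names picked for the summary are just common ones that may or may not be present in a given file.

```python
import json
import subprocess


def parse_exiftool_json(text):
    """ExifTool's -json output is a list with one object per file; return the first."""
    records = json.loads(text)
    return records[0] if records else {}


def read_metadata(path):
    """Run `exiftool -json <path>` and return that file's metadata as a dict."""
    result = subprocess.run(
        ["exiftool", "-json", path],
        capture_output=True, text=True, check=True,
    )
    return parse_exiftool_json(result.stdout)


def summarize(metadata, tags=("Make", "Model", "CreateDate")):
    """Pick out a few tags of interest, skipping any that are absent."""
    return {tag: metadata[tag] for tag in tags if tag in metadata}


# Made-up sample output so the parsing can be tried without ExifTool installed.
sample = '[{"SourceFile": "yourmasterpiece.jpg", "Make": "ExampleCam", "CreateDate": "2023:08:10 09:30:00"}]'
print(summarize(parse_exiftool_json(sample)))
```

On a system with ExifTool installed, `summarize(read_metadata("yourmasterpiece.jpg"))` would do the same against a real file.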
As you grow more comfortable with ExifTool, you’ll discover it has a plethora of advanced features waiting to be explored. From handling multiple files at once to copying metadata between files, the possibilities are vast. Dive into the documentation, or run exiftool with no arguments to print it in your terminal, to uncover more functionalities.
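As a starting point for those two features, here is a small sketch that builds the ExifTool command lines from Python. It relies on the `-r` (recurse into subdirectories), `-json`, and `-TagsFromFile` options; the helper names and file paths are hypothetical, and nothing is executed until you pass a command to `subprocess.run` yourself.

```python
import shlex


def batch_read_command(directory):
    """Command to read metadata from every file under a directory as JSON."""
    return ["exiftool", "-json", "-r", directory]


def copy_tags_command(source, target):
    """Command to copy all writable tags from source to target.

    -all:all keeps each tag in its original group. By default ExifTool
    keeps a backup of the target with "_original" appended to its name.
    """
    return ["exiftool", "-TagsFromFile", source, "-all:all", target]


# Print the commands in copy-and-paste form (hypothetical paths).
print(shlex.join(batch_read_command("photos")))
print(shlex.join(copy_tags_command("original.jpg", "edited.jpg")))
```

Building the command as a list rather than a single string avoids quoting problems with file names that contain spaces, such as “Dale at Blackhat 2023.jpg”.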