
Tech News

3711 Articles

T-SQL Tuesday Retrospective #002: A Puzzling Situation from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
For the second T-SQL Tuesday ever — again, hosted by Adam Machanic — we were asked to pick one of three options, and I elected to go with the first one: describe a confusing situation you encountered, and explain how you debugged the problem and what the resolution was. This invitation was originally posted on 4 January… Continue reading T-SQL Tuesday Retrospective #002: A Puzzling Situation. The post T-SQL Tuesday Retrospective #002: A Puzzling Situation appeared first on Born SQL and on SQLServerCentral.


Daily Coping 28 Oct 2020 from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. Today’s tip is to plan a fun or exciting activity to look forward to. I’ve got a couple, but in the short term, my wife and I decided to take advantage of a gift from Redgate. For the 21st birthday of the company, everyone got a voucher for an “experience” in their area. In looking over the items, we decided to do a Wine and Painting night with our kids. They are artistic, and we like wine. Actually, my son can drink wine as well, so 3 of us will enjoy wine, with 2 painting. I’m looking forward to the night we can do this. The post Daily Coping 28 Oct 2020 appeared first on SQLServerCentral.


Azure Databricks and Azure Key Vault from Blog Posts - SQLServerCentral

Anonymous
28 Oct 2020
1 min read
Azure Key Vault should always be a core component of your Azure design, because it lets you store keys, secrets, and certificates and thereby keep the true connection strings hidden rather than written into files. When working with Databricks to mount storage to ingest your … Continue reading. The post Azure Databricks and Azure Key Vault appeared first on SQLServerCentral.
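The excerpt trails off before the mount itself. As a hedged sketch (not from the original post), mounting ADLS Gen2 from a Databricks notebook with a service-principal secret pulled from a Key Vault-backed secret scope typically looks like the following; the scope, key, account, container and tenant values are placeholders:

# Hedged sketch: mount ADLS Gen2 using a service principal whose client secret
# lives in an Azure Key Vault-backed Databricks secret scope.
# `dbutils` is provided automatically inside a Databricks notebook.
client_secret = dbutils.secrets.get(scope="kv-backed-scope", key="sp-client-secret")

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",          # placeholder
    "fs.azure.account.oauth2.client.secret": client_secret,
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",  # placeholder
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/ingest",
    extra_configs=configs,
)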


Tableau migrates to the cloud: how we evaluated our modernization from What's New

Anonymous
26 Oct 2020
7 min read
Erin Gengo, Manager, Analytics Platforms, Tableau | October 26, 2020

Cloud technologies make it easier, faster, and more reliable to ingest, store, analyze, and share data sets that range in type and size. The cloud also provides strong tools for governance and security which enable organizations to move faster on analytics initiatives. Like many of our customers, we at Tableau wanted to realize these benefits. Having done the heavy lifting to move our data into the cloud, we now have the opportunity to reflect and share our migration story.

As we embarked on the journey of selecting and moving to a cloud-driven data platform from a conventional on-premises solution, we were in a unique position. With our mission to help people see and understand data, we’ve always encouraged employees to use Tableau to make faster, better decisions. Between our culture of democratizing data and rapid, significant growth, we consequently had servers running under people’s desks, powering data sources that were often in conflict. It also created a messy server environment where we struggled to maintain proper documentation, apply standard governance practices, and manage downstream data to avoid duplication. When it came time to migrate, this put pressure on analysts and strained resources.

Despite some of our unique circumstances, we know Tableau isn’t alone in facing some of these challenges—from deciding what and when to migrate to the cloud, to how to better govern self-service analytics and arrive at a single source of truth. We’re pleased to share our lessons learned so customers can make informed decisions along their own cloud journeys.

Our cloud evaluation measures

Because the cloud is now the preferred place for businesses to run their IT infrastructure, choosing to shift our enterprise analytics to a SaaS environment (Tableau Online) was a key first step. After that, we needed to carefully evaluate cloud platforms and choose the best solution for hosting our data. The top criteria we focused on during the evaluation phase were:

Performance: The platform had to be highly performant, supporting everything from ad-hoc analysis to high-volume, regular reporting across diverse use cases. We wanted fewer “knobs” to turn and an infrastructure that adapted to usage patterns, responded dynamically, and included automatic encryption.

Scale: We wanted scalable compute and storage that would adjust to changes in demand—whether we were in a busy time of closing financial books for the quarter or faced with quickly and unpredictably shifting needs, like an unexpected pandemic. Whatever we chose needed compute power that scaled to match our data workloads.

Governance and security: We’re a data-driven organization, but because much of that data wasn’t always effectively governed, we knew we were missing out on value that the data held. Thus, we required technology that supported enterprise governance as well as the increased security that our growing business demands.

Flexibility: We needed the ability to scale infrastructure up or down to meet performance and cost needs. We also wanted a cloud platform that matched Tableau’s handling of structured, unstructured, or semi-structured data types to increase performance across our variety of analytics use cases.

Simplicity: Tableau sought a solution that was easy to use and manage across skill levels, including teams with seasoned engineers or teams without them that managed their data pipelines through Tableau Prep.
If they quickly saw the benefit of the cloud architecture to streamline workflows and reduce their time to insight, it would help them focus on creating data context and support governance that enabled self-service—a win-win for all.

Cost-efficiency: A fixed database infrastructure can create large overhead costs. Knowing many companies purchase their data warehouse to meet the highest demand timeframes, we needed high performance and capacity, but not 24/7. That could cost Tableau millions of dollars of unused capacity.

Measurement and testing considerations

We needed to deploy at scale and account for diverse use cases, as well as quickly get our people answers from their data to make important, in-the-moment decisions. After narrowing our choices, we followed that with testing to ensure the cloud solution performed as efficiently as we needed it to. We tested:

- Dashboard load times; we tested more than 20,000 Tableau vizzes
- Data import speeds
- Compute power
- Extract refreshes
- How fast the solution allows our London and Singapore data centers to access data that we have stored in our US-West-2a regional data center

We advise similar testing for organizations like us, but we also suggest asking some other questions to guarantee the solution aligns with your top priorities and concerns:

- What could the migration path look like from your current solution to the cloud? (For us, SQL Server to Snowflake)
- What's the learning curve like for data engineers—both for migration and afterward?
- Is the cost structure of the cloud solution transparent, so you can somewhat accurately forecast/estimate your costs?
- Will the solution lower administration and maintenance?
- How does the solution fit with your current development practices and methods, and what is the impact for processes that may have to change?
- How will you handle authentication?
- How will this solution fit with our larger vendor and partner ecosystem?

Tableau’s choice: Snowflake

There isn’t a one-size-fits-all approach, and it’s worth exploring various cloud data platforms. We found that in prioritizing requirements and making careful, conscious choices of where we wouldn’t make any sacrifices, a few vendors rose to the top as our shortlist for evaluation.

In our data-heavy, dynamic environment where needs and situations change on a dime, we found Snowflake met our needs and then some. It is feature-rich with a dynamic, collaborative environment that brings Tableau together—sales, marketing, finance, product development, and executives who must quickly make decisions for the health, safety, and progress of the business.

“This process had a transformational effect on my team, who spent years saying ‘no’ when we couldn’t meet analytics demands across Tableau,” explained Phillip Cheung, a product manager who helped drive the evaluation and testing process. “Now we can easily respond to any request for data in a way that fully supports self-service analytics with Tableau.”

Cloud adoption, accelerated

With disruption on a global scale, the business landscape is changing like we’ve never experienced. Every organization, government agency, and individual has been impacted by COVID-19. We’re all leaning into data for answers and clarity to move ahead. And through these times of rapid change, the cloud has proven even more important than we thought. As a result of the pandemic, organizations are accelerating and prioritizing cloud adoption and migration efforts.
According to a recent IDC survey, almost 50 percent of technology decision makers expect to moderately or significantly increase demand for cloud computing as a result of the pandemic. Meredith Whalen, chief research officer, said, “A number of CIOs tell us their cloud migration investments paid off during the pandemic as they were able to easily scale up or down.” (Source: IDC. COVID-19 Brings New C-Suite Priorities, May 2020.) We know that many of our customers are considering or already increasing their cloud investments. And we hope our lessons learned will help others gain useful perspective in moving to the cloud, and to ultimately grow more adaptive, resilient, and successful as they plan for the future. So stay tuned—as part of this continued series, we’ll also be sharing takeaways and experiences from IT and end users during key milestones as we moved our data and analytics to the cloud.


Apple Entrepreneur Camp applications open for Black founders and developers from News - Apple Developer

Matthew Emerick
19 Oct 2020
1 min read
Apple Entrepreneur Camp supports underrepresented founders and developers as they build the next generation of cutting-edge apps and helps form a global network that encourages the pipeline and longevity of these entrepreneurs in technology. Applications are now open for the first cohort for Black founders and developers, which runs online from February 16 to 25, 2021. Attendees receive code-level guidance, mentorship, and inspiration with unprecedented access to Apple engineers and leaders. Applications close on November 20, 2020. Learn more about Apple Entrepreneur Camp Learn about some of our inspiring alumni


Weekly Digest, October 19 from Featured Blog Posts - Data Science Central

Matthew Emerick
18 Oct 2020
1 min read
Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.

Featured Resources and Technical Contributions

- Best Models For Multi-step Time Series Modeling
- Types of Variables in Data Science in One Picture
- A quick demonstration of polling confidence interval calculations using simulation
- Why you should NEVER run a Logistic Regression (unless you have to)
- Cross-validation and hyperparameter tuning
- 5 Great Data Science Courses
- Complete Hands-Off Automated Machine Learning
- Why You Should Learn Sitecore CMS?

Featured Articles

- AI is Driving Software 2.0… with Minimal Human Intervention
- Data Observability: How to Fix Your Broken Data Pipelines
- Applications of Machine Learning in FinTech
- Where synthetic data brings value
- Why Fintech is the Future of Banking?
- Real Estate: How it is Impacted by Business Intelligence
- Determining How Cloud Computing Benefits Data Science
- Advantages And Disadvantages Of Mobile Banking

Picture of the Week

Source: article flagged with a +

To make sure you keep getting these emails, please add mail@newsletter.datasciencecentral.com to your address book or whitelist us. To subscribe, click here. Follow us: Twitter | Facebook.

Genius Tool to Compare Best Time-Series Models For Multi-step Time Series Modeling from Featured Blog Posts - Data Science Central

Matthew Emerick
18 Oct 2020
17 min read
Predict Number of Active Cases by Covid-19 Pandemic based on Medical Facilities (Volume of Testing, ICU beds, Ventilators, Isolation Units, etc.) using Multi-variate LSTM based Multi-Step Forecasting Models

Introduction and Motivation

The intensity of the growth of the covid-19 pandemic worldwide has propelled researchers to evaluate the best machine learning model that could predict the number of people affected in the distant future by considering the current statistics and predicting the near-future terms in subsequent stages. While different univariate models like ARIMA/SARIMA and traditional time-series models are capable of predicting the number of active cases, daily recoveries, and number of deaths, they do not take into consideration other time-varying factors like Medical Facilities (Volume of Testing, ICU beds, Hospital Admissions, Ventilators, Isolation Units, Quarantine Centres, etc.). As these factors become important, we build a predictive model that can predict the Number of Active Cases, Deaths, and Recoveries based on the change in Medical Facilities as well as other changes in infrastructure. Here in this blog, we try to model Multi-step Time Series Prediction using Deep Learning Models on the basis of Medical Information available for different states of India.

Multi-Step Time Series Prediction

A typical multi-step predictive model looks as in the figure below, where each of the predicted outcomes from the previous state is treated as next-state input to derive the outcome for the second state, and so forth.

[Figure: multi-step time-series prediction, from the tensorflow.org structured-data time-series tutorial (Source)]

Deep Learning-based Multi-variate Time Series Training and Prediction

The following figure illustrates the important steps involved in selecting the best deep learning model.

[Figure: Time-Series based Single/Multi-Step Prediction]

- Feeding multi-variate data from a single source, or from aggregated sources available directly from the cloud or other 3rd-party providers, into the ML modeling data ingestion system.
- Cleaning, preprocessing, and feature engineering of the data, involving scaling and normalization.
- Conversion of the data to a supervised time-series.
- Feeding the data to a deep learning training source that can train different time-series models like LSTM, CNN, BI-LSTM, CNN+LSTM using different combinations of hidden layers, neurons, batch size, and other hyper-parameters.
- Forecasting for the near term or a far distant term in the future, using Single-Step or Multi-Step Forecasting respectively.
- Evaluation of error metrics (MAPE, MAE, ME, RMSE, MPE) by comparing predictions with the actual data, when it comes in.
- Re-training the model and continuous improvement when the error threshold is exceeded.

Import Necessary Tensorflow Libraries

The code snippet gives an overview of the necessary libraries required for tensorflow.

from tensorflow.python.keras.layers import Dense, LSTM, RepeatVector, TimeDistributed, Flatten, Bidirectional
from tensorflow.python.keras import Sequential
from tensorflow.python.keras.layers.convolutional import Conv1D, Conv2D, MaxPooling1D, ConvLSTM2D

Data Loading and Selecting Features

As Delhi had high Covid-19 cases, here we model different DL models for the “DELHI” State (National Capital of India). Further, we keep the scope of dates from 25th March to 6th June 2020. Data till 29th April has been used for training, whereas data from 30th April to 6th June has been used for testing/prediction.
The test data has been used to predict for 7 days, for 3 subsequent stages of prediction. This code demonstrates that the data is first split into a 70:30 ratio between training and testing (by finding the closest number divisible by 7), where each set is then restructured into weekly samples of data.

def split_dataset(data):
    # split into standard weeks
    print(np.shape(data))
    split_factor = int((np.shape(data)[0] * 0.7))
    print("Split Factor no is", split_factor)
    m = 7
    trn_close_no = closestNumber(split_factor, m)
    te_close_no = closestNumber((np.shape(data)[0] - split_factor), m)
    train, test = data[0:trn_close_no], data[trn_close_no:(trn_close_no + te_close_no)]
    print("Initials Train-Test Split --", np.shape(train), np.shape(test))
    len_train = np.shape(train)[0]
    len_test = np.shape(test)[0]
    # restructure into windows of weekly data
    train = array(split(train[0:len_train], len(train[0:len_train]) / 7))
    test = array(split(test, len(test) / 7))
    print("Final Train-Test Split --", np.shape(train), np.shape(test))
    return train, test

Initials Train-Test Split -- (49, 23) (21, 23)    ----- Training and Test DataSet
Final Train-Test Split -- (7, 7, 23) (3, 7, 23)   ----- Arrange Train and Test DataSet into 7 and 3 weekly samples respectively

The data set and the features have been scaled using Min-Max Scaler.

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_dataset = scaler.fit_transform(dataset)

Convert Time-Series to a Supervised DataSet

The tricky part in converting the time-series to a supervised time-series for multi-step prediction lies in incorporating the number of past days (i.e. the historic data) that the weekly data has to consider. The series derived by considering historic data is considered 7 times during training iterations and 3 times during testing iterations (as it got split into (7, 7, 23) and (3, 7, 23), where 22 is the number of input features with one predicted output). This series built using historic data helps the model to learn and predict any day of the week.

Note 1: This is the most important step in formulating time-series data for a multi-step model. The code snippet below demonstrates what is described above.

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            X.append(data[in_start:in_end, :])
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)
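As a quick sanity check, not part of the original post, the windowing above can be exercised on dummy data shaped like the Delhi training set; n_input=14 is an assumed value here:

# Hedged sketch: verify the windows produced by to_supervised() on dummy data
# shaped like the training set (49 days x 23 features).
import numpy as np
from numpy import array, split

n_days, n_features = 49, 23
dummy = np.arange(n_days * n_features, dtype=float).reshape(n_days, n_features)
train = array(split(dummy, n_days // 7))   # (7, 7, 23) weekly blocks

X, y = to_supervised(train, n_input=14)    # defaults to n_out=7
print(X.shape, y.shape)                    # (29, 14, 23) (29, 7)
# 49 - (14 + 7) + 1 = 29 sliding windows, each pairing 14 past days of all
# 23 features with the next 7 days of the target column (active cases).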
Training Different Deep Learning Models using Tensorflow

In this section, we describe how we train different DL models using Tensorflow's Keras APIs.

Convolution Neural Network (CNN Model)

The following figure recollects the structure of a Convolution Neural Network (CNN), with a code snippet showing how a 1D CNN with 16 filters and a kernel size of 3 has been used to train the network over 7 steps, where each step is of 7 days.

[Figure: structure of a Convolution Neural Network (Source)]

# train CNN model
def build_model_cnn(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 200, 4
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # define model
    model = Sequential()
    model.add(Conv1D(filters=16, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(10, activation='relu'))
    model.add(Dense(n_outputs))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

[Figure: CNN]

LSTM

The following code snippet demonstrates how we train an LSTM model and plot the training and validation loss, before making a prediction.

# train LSTM model
def build_model_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    print(np.shape(train_x))
    print(np.shape(train_y))
    # define parameters
    verbose, epochs, batch_size = 0, 50, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The figure below illustrates the Actual vs Predicted Outcome of the Multi-Step LSTM model after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: LSTM]

Bi-Directional LSTM

The following code snippet demonstrates how we train a BI-LSTM model and plot the training and validation loss, before making a prediction.

[Figure: Bi-Directional LSTM (Source)]

# train Bi-Directional LSTM model
def build_model_bi_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    print(np.shape(train_x))
    print(np.shape(train_y))
    # define parameters
    verbose, epochs, batch_size = 0, 50, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Bidirectional(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features))))
    model.add(RepeatVector(n_outputs))
    model.add(Bidirectional(LSTM(200, activation='relu', return_sequences=True)))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The figure below illustrates the Actual vs Predicted Outcome of the Multi-Step Bi-LSTM model after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: BI-LSTM]

Stacked LSTM + CNN

Here we have used Conv1D with a TimeDistributed layer, which is then fed to a single layer of LSTM, to predict different sequences, as illustrated by the figure below.
The CNN model is built first, then added to the LSTM model by wrapping the entire sequence of CNN layers in a TimeDistributed layer.

[Figure: Stacked CNN + LSTM (Source)]

# train Stacked CNN + LSTM model
def build_model_cnn_lstm(train, n_input):
    # prepare data
    train_x, train_y = to_supervised(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 500, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape=(n_timesteps, n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

The prediction and inverse scaling help to yield the actual predicted outcomes, as illustrated below.

[Figure: LSTM with CNN]
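The post shows the inverse-transformed plots but not the inverse-scaling step itself. Here is a minimal sketch of that step, assuming the MinMaxScaler was fit on the full 23-column dataset with the target (active cases) in column 0:

# Hedged sketch (not from the original post): undo the Min-Max scaling for the
# predicted target column only. MinMaxScaler applies X_scaled = X * scale_ + min_
# per column, so the inverse for one column can be applied element-wise.
import numpy as np

def inverse_scale_predictions(scaler, yhat, target_col=0):
    yhat = np.asarray(yhat)   # e.g. shape (3, 7): 3 test weeks x 7 days
    return (yhat - scaler.min_[target_col]) / scaler.scale_[target_col]

# usage: actual_case_counts = inverse_scale_predictions(scaler, predictions)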
Multi-Step Forecasting and Evaluation

The snippet below shows how the data is reshaped into (1, n_input, n) to forecast the following week. For the multi-variate time series (of 23 features), the history used for forecasting grows over the 3 test weeks from (7, 7, 23) to (8, 7, 23) and (9, 7, 23), and is flattened to (49, 23), (56, 23) and (63, 23) before the last n_input days are taken as model input. Prediction is made for 3 weeks, with the output of each week carried forward into the history used for the next.

# make a forecast
def forecast(model, history, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0] * data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, :]
    # reshape into [1, n_input, n]
    input_x = input_x.reshape((1, input_x.shape[0], input_x.shape[1]))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

Note 2: If you wish to see the evaluation results and plots for each step as stated below, please check the notebook at Github (https://github.com/sharmi1206/covid-19-analysis, Notebook ts_dlearn_mstep_forecats.ipynb).

Here at each step, at the granularity of every week, we evaluate the model and compare it against the actual output.

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
    print("Actual Results", np.shape(actual))
    print("Predicted Results", np.shape(predicted))
    scores = list()
    # calculate an RMSE score for each day
    for i in range(actual.shape[1]):
        # calculate mse
        mse = mean_squared_error(actual[:, i], predicted[:, i])
        # calculate rmse
        rmse = sqrt(mse)
        # store
        scores.append(rmse)
        plt.figure(figsize=(14, 12))
        plt.plot(actual[:, i], label='actual')
        plt.plot(predicted[:, i], label='predicted')
        plt.title(ModelType + ' based Multi-Step Time Series Active Cases Prediction for step ' + str(i))
        plt.legend()
        plt.show()
    # calculate overall RMSE
    s = 0
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col]) ** 2
    score = sqrt(s / (actual.shape[0] * actual.shape[1]))
    return score, scores

# evaluate a single model
def evaluate_model(train, test, n_input):
    model = None
    # fit model
    if ModelType == 'LSTM':
        print('lstm')
        model = build_model_lstm(train, n_input)
    elif ModelType == 'BI_LSTM':
        print('bi_lstm')
        model = build_model_bi_lstm(train, n_input)
    elif ModelType == 'CNN':
        print('cnn')
        model = build_model_cnn(train, n_input)
    elif ModelType == 'LSTM_CNN':
        print('lstm_cnn')
        model = build_model_cnn_lstm(train, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast(model, history, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores, test[:, :, 0], predictions

Here we show a univariate and a multi-variate, multi-step time-series prediction.

Multi-Step Conv2D + LSTM (Uni-variate & Multi-Variate) based Prediction for State Delhi

[Figure: ConvLSTM (Source)]

A type of CNN-LSTM is the ConvLSTM (primarily for two-dimensional spatial-temporal data), where the convolutional reading of input is built directly into each LSTM unit.
Here for this particular univariate time series, we have the input vector as [timesteps=14, rows=1, columns=7, features=2 (input and output)].

# train CONV LSTM2D model
def build_model_cnn_lstm_2d(train, n_steps, n_length, n_input):
    # prepare data
    train_x, train_y = to_supervised_2cnn_lstm(train, n_input)
    # define parameters
    verbose, epochs, batch_size = 0, 750, 16
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # reshape into subsequences [samples, time steps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    # reshape output into [samples, timesteps, features]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    # define model
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1, 3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    model.compile(loss='mse', optimizer='adam')
    # fit network
    model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
    return model

# convert history into inputs and outputs
def to_supervised_2cnn_lstm(train, n_input, n_out=7):
    # flatten data
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = list(), list()
    in_start = 0
    # step over the entire history one time step at a time
    for _ in range(len(data)):
        # define the end of the input sequence
        in_end = in_start + n_input
        out_end = in_end + n_out
        # ensure we have enough data for this instance
        if out_end <= len(data):
            x_input = data[in_start:in_end, 0]
            x_input = x_input.reshape((len(x_input), 1))
            X.append(x_input)
            y.append(data[in_end:out_end, 0])
        # move along one time step
        in_start += 1
    return array(X), array(y)

# make a forecast
def forecast_2cnn_lstm(model, history, n_steps, n_length, n_input):
    # flatten data
    data = array(history)
    data = data.reshape((data.shape[0] * data.shape[1], data.shape[2]))
    # retrieve last observations for input data
    input_x = data[-n_input:, 0]
    # reshape into [samples, time steps, rows, cols, channels]
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    # forecast the next week
    yhat = model.predict(input_x, verbose=0)
    # we only want the vector forecast
    yhat = yhat[0]
    return yhat

# evaluate a single model
def evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input):
    # fit model
    model = build_model_cnn_lstm_2d(train, n_steps, n_length, n_input)
    # history is a list of weekly data
    history = [x for x in train]
    # walk-forward validation over each week
    predictions = list()
    for i in range(len(test)):
        # predict the week
        yhat_sequence = forecast_2cnn_lstm(model, history, n_steps, n_length, n_input)
        # store the predictions
        predictions.append(yhat_sequence)
        # get real observation and add to history for predicting the next week
        history.append(test[i, :])
    # evaluate predictions days for each week
    predictions = array(predictions)
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores, test[:, :, 0], predictions
Reading State-wise data and indexing time columns:

df_state_all = pd.read_csv('all_states/all.csv')
df_state_all = df_state_all.drop(columns=['Latitude', 'Longitude', 'index'])
stateName = unique_states[8]
dataset = df_state_all[df_state_all['Name of State / UT'] == unique_states[8]]
dataset = dataset.sort_values(by='Date', ascending=True)
dataset = dataset[(dataset['Date'] >= '2020-03-25') & (dataset['Date'] <= '2020-06-06')]
print(np.shape(dataset))
daterange = dataset['Date'].values
no_Dates = len(daterange)
dateStart = daterange[0]
dateEnd = daterange[no_Dates - 1]
print(dateStart)
print(dateEnd)
dataset = dataset.drop(columns=['Unnamed: 0', 'Date', 'source1', 'state', 'Name of State / UT', 'tagpeopleinquarantine', 'tagtotaltested'])
print(np.shape(dataset))
n = np.shape(dataset)[0]
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_dataset = scaler.fit_transform(dataset)
# split into train and test
train, test = split_dataset(scaled_dataset)
# define the number of subsequences and the length of subsequences
n_steps, n_length = 2, 7
# define the total days to use as input
n_input = n_length * n_steps
score, scores, actual, predicted = evaluate_model_2cnn_lstm(train, test, n_steps, n_length, n_input)
# summarize scores
summarize_scores(ModelType, score, scores)

The model parameters can be summarized as:

[Figure: Model Summary, Conv2D + LSTM]

The evaluate_model function appends the model forecasting score at each step and returns it at the end. The figure below illustrates the Actual vs Predicted Outcome of the Multi-Step ConvLSTM2D model after the predicted outcome has been inverse-transformed (to remove the effect of scaling).

[Figure: Uni-Variate ConvLSTM2D]

For a multi-variate time series with 22 input features and one output prediction, we take into consideration the following changes. In the function forecast_2cnn_lstm, we replace the input data shaping to constitute the multi-variate features:

# In function forecast_2cnn_lstm (replacing 0 with : to keep all features)
input_x = data[-n_input:, :]
# reshape into [samples, time steps, rows, cols, channels]
# (replacing 1 with data.shape[1] for multi-variate input)
input_x = input_x.reshape((1, n_steps, 1, n_length, data.shape[1]))

Further, in the function to_supervised_2cnn_lstm, we replace x_input's feature size from 0 to : and 1 with 23 features as follows:

x_input = data[in_start:in_end, :]
x_input = x_input.reshape((len(x_input), x_input.shape[1]))

[Figure: Multi-Variate ConvLSTM2D]

Conv2D + BI_LSTM

We can further try out a Bi-Directional LSTM with a 2D Convolution layer as depicted in the figure below. The model stacking and subsequent layers remain the same as in the previous step, with the exception of using a BI-LSTM in place of a single LSTM.

[Figure: Conv2D + BI_LSTM (Source)]

Comparison of Model Metrics on the test data set

Deep Learning Method                       | RMSE
LSTM                                       | 912.224
BI LSTM                                    | 1317.841
CNN                                        | 1021.518
LSTM + CNN                                 | 891.076
Conv2D + LSTM (Uni-Variate Single-Step)    | 1288.416
Conv2D + LSTM (Multi-Variate Multi-Step)   | 863.163

Conclusion

In this blog, I have discussed multi-step time-series prediction using deep learning mechanisms and compared/evaluated them based on RMSE. Here, we notice that for a forecasting time-period of 7 days, stacked ConvLSTM2D works the best, followed by LSTM with CNN, CNN, and LSTM networks. More extensive model evaluation with different hidden layers and neurons, with efficient hyperparameter tuning, can further improve accuracy. Though we see the model accuracy decrease for multi-step models, this can be a useful tool for long-term forecasts, where predicted outcomes from the previous week play a dominant role in the predicted outputs. For the complete source code, check out https://github.com/sharmi1206/covid-19-analysis

Acknowledgments

Special thanks to machinelearningmastery.com, as some of the concepts have been taken from there.
References

https://arxiv.org/pdf/1801.02143.pdf
https://github.com/sharmi1206/covid-19-analysis
https://machinelearningmastery.com/multi-step-time-series-forecasting/
https://machinelearningmastery.com/multi-step-time-series-forecasting-with-machine-learning-models-for-household-electricity-consumption/
https://machinelearningmastery.com/how-to-develop-lstm-models-for-multi-step-time-series-forecasting-of-household-power-consumption/
https://machinelearningmastery.com/convert-time-series-supervised-learning-problem-python/
https://www.tensorflow.org/tutorials/structured_data/time_series
https://www.aiproblog.com/index.php/2018/11/13/how-to-develop-lstm-models-for-time-series-forecasting/


Types of Variables in Data Science in One Picture from Featured Blog Posts - Data Science Central

Matthew Emerick
18 Oct 2020
1 min read
While there are several dozen different types of possible variables, all can be categorized into a few basic areas. This simple graphic shows you how they are related, with a few examples of each type.  More info: Types of variables in statistics and research  


ServiceNow Partners with IBM on AIOps from DevOps.com

Matthew Emerick
16 Oct 2020
1 min read
ServiceNow and IBM this week announced that the Watson artificial intelligence for IT operations (AIOps) platform from IBM will be integrated with the IT service management (ITSM) platform from ServiceNow. Pablo Stern, senior vice president for IT workflow products for ServiceNow, said once that capability becomes available later this year on the Now platform, IT […] The post ServiceNow Partners with IBM on AIOps appeared first on DevOps.com.


Hans-Juergen Schoenig: PostgreSQL: Sophisticating temporary tables from Planet PostgreSQL

Matthew Emerick
16 Oct 2020
4 min read
Temporary tables have been around forever and are widely used by application developers. However, there is more to temporary tables than meets the eye. PostgreSQL allows you to configure the lifespan of a temporary table in a nice way and helps to avoid some common pitfalls.

CREATE TEMPORARY TABLE …

By default, a temporary table will live as long as your database connection. It will be dropped as soon as you disconnect. In many cases this is the behavior people want:

tmp=# CREATE TEMPORARY TABLE x (id int);
CREATE TABLE
tmp=# \d
        List of relations
  Schema   | Name | Type  | Owner
-----------+------+-------+-------
 pg_temp_3 | x    | table | hs
(1 row)
tmp=# \q
iMac:~ hs$ psql tmp
psql (12.3)
Type "help" for help.
tmp=# \d
Did not find any relations.

Once we have reconnected, the table is gone for good. Also, keep in mind that the temporary table is only visible within your session. Other connections are not going to see the table (which is, of course, the desired behavior). This also implies that many sessions can create a temporary table having the same name. However, a temporary table can do more. The most important thing is the ability to control what happens on commit:

[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]

As you can see, there are three options. “PRESERVE ROWS” is the behavior you have just witnessed. Sometimes you don’t want that. It is therefore also possible to empty a temporary table on commit:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DELETE ROWS AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# SELECT * FROM x;
 y
---
 1
 2
 3
 4
 5
(5 rows)
tmp=# COMMIT;
COMMIT
tmp=# SELECT * FROM x;
 y
---
(0 rows)

In this case, PostgreSQL simply leaves us with an empty table as soon as the transaction ends. The table itself is still around and can be used. Let us drop the table for now:

tmp=# DROP TABLE x;
DROP TABLE

Sometimes you want the entire table to be gone at the end of the transaction: “ON COMMIT DROP” can be used to achieve exactly that:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DROP AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# COMMIT;
COMMIT
tmp=# SELECT * FROM x;
ERROR:  relation "x" does not exist
LINE 1: SELECT * FROM x;

PostgreSQL will throw an error because the table is already gone. What is noteworthy here is that you can still use WITH HOLD cursors, as shown in the next example:

tmp=# BEGIN;
BEGIN
tmp=# CREATE TEMP TABLE x ON COMMIT DROP AS SELECT * FROM generate_series(1, 5) AS y;
SELECT 5
tmp=# DECLARE mycur CURSOR WITH HOLD FOR SELECT * FROM x;
DECLARE CURSOR
tmp=# COMMIT;
COMMIT
tmp=# FETCH ALL FROM mycur;
 y
---
 1
 2
 3
 4
 5
(5 rows)

The table itself is still gone, but the WITH HOLD cursor will ensure that the “content” of the cursor survives the end of the transaction. Many people don’t expect this kind of behavior, but it makes sense and can come in pretty handy.

Controlling memory usage …

If you are using temporary tables, it makes sense to keep them relatively small. In some cases, however, a temporary table might be quite large for whatever reason. To ensure that performance stays good, you can tell PostgreSQL to keep more of a temporary table in RAM. temp_buffers is the parameter in postgresql.conf you should be looking at in this case:

tmp=# SHOW temp_buffers;
 temp_buffers
--------------
 8MB
(1 row)

The default value is 8 MB. If your temporary tables are large, increasing this value certainly makes sense.
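If you drive PostgreSQL from application code, the same ideas carry over. Below is a hedged sketch using psycopg2; the connection string and buffer size are placeholders, and temp_buffers has to be set before the session touches any temporary table:

# Hedged sketch (not from the original post): raise temp_buffers for one session
# and create an ON COMMIT DROP temp table from Python with psycopg2.
import psycopg2

conn = psycopg2.connect("dbname=tmp user=hs")  # placeholder connection string
with conn, conn.cursor() as cur:
    cur.execute("SET temp_buffers = '64MB'")   # per-session setting, placeholder size
    cur.execute(
        "CREATE TEMP TABLE x ON COMMIT DROP AS "
        "SELECT * FROM generate_series(1, 5) AS y"
    )
    cur.execute("SELECT count(*) FROM x")
    print(cur.fetchone()[0])                   # 5 -- table vanishes when the block commits
conn.close()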
Finally … If you want to find out more about PostgreSQL database performance in general, consider checking out my post about three ways to detect and fix slow queries. The post PostgreSQL: Sophisticating temporary tables appeared first on Cybertec.

Why It’s Time for Site Reliability Engineering to Shift Left from DevOps.com

Matthew Emerick
16 Oct 2020
1 min read
By adopting a multilevel approach to site reliability engineering and arming your team with the right tools, you can unleash benefits that impact the entire service-delivery continuum. In today’s application-driven economy, the infrastructure supporting business-critical applications has never been more important. In response, many companies are recruiting site reliability engineering (SRE) specialists to help them […] The post Why It’s Time for Site Reliability Engineering to Shift Left appeared first on DevOps.com.


Best Practices for Managing Remote IT Teams from DevOps.com

Matthew Emerick
16 Oct 2020
1 min read
With IT teams around the world adjusting to remote working while still struggling to maintain productivity and workflows, ensuring DevOps and Agile practices are in place has never been more important. And while typical DevOps professionals probably have significantly better remote desktop setups than most of their business peers, it’s how IT leaders manage these […] The post Best Practices for Managing Remote IT Teams appeared first on DevOps.com.


Data Measured in Terms of Real Aggregate Value from DevOps.com

Matthew Emerick
16 Oct 2020
1 min read
The post Data Measured in Terms of Real Aggregate Value appeared first on DevOps.com.

Offer your apps for pre-order even earlier from News - Apple Developer

Matthew Emerick
15 Oct 2020
1 min read
Now you can let customers pre-order your app up to 180 days before it’s released for download on the App Store. Take advantage of this longer lead time to build more excitement for your app’s features, services, and content, and to encourage more pre-orders. Once your app is released, customers will be notified and it will automatically download to their device within 24 hours. Learn more about pre-orders


New – Amazon RDS on Graviton2 Processors from AWS News Blog

Matthew Emerick
15 Oct 2020
3 min read
I recently wrote a post to announce the availability of the M6g, R6g and C6g families of instances on Amazon Elastic Compute Cloud (EC2). These instances offer a better cost-performance ratio than their x86 counterparts. They are based on AWS-designed AWS Graviton2 processors, utilizing 64-bit Arm Neoverse N1 cores. Starting today, you can also benefit from better cost-performance for your Amazon Relational Database Service (RDS) databases, compared to the previous M5 and R5 generation of database instance types, with the availability of AWS Graviton2 processors for RDS. You can choose between M6g and R6g instance families and three database engines (MySQL 8.0.17 and higher, MariaDB 10.4.13 and higher, and PostgreSQL 12.3 and higher). M6g instances are ideal for general purpose workloads. R6g instances offer 50% more memory than their M6g counterparts and are ideal for memory intensive workloads, such as Big Data analytics.

Graviton2 instances provide up to 35% performance improvement and up to 52% price-performance improvement for RDS open source databases, based on internal testing of workloads with varying characteristics of compute and memory requirements. The Graviton2 instance family includes several new performance optimizations such as larger L1 and L2 caches per core, higher Amazon Elastic Block Store (EBS) throughput than comparable x86 instances, fully encrypted RAM, and many others as detailed on this page. You can benefit from these optimizations with minimal effort, by provisioning or migrating your RDS instances today.

RDS instances are available in multiple configurations, starting with 2 vCPUs, with 8 GiB memory for M6g and 16 GiB memory for R6g, with up to 10 Gbps of network bandwidth, giving you new entry-level general purpose and memory optimized instances. The table below shows the list of instance sizes available for you:

Instance Size | vCPU | Memory (GiB) M6g | Memory (GiB) R6g | Dedicated EBS Bandwidth (Mbps) | Network Bandwidth (Gbps)
large         |   2  |   8              |  16              | Up to 4750                     | Up to 10
xlarge        |   4  |  16              |  32              | Up to 4750                     | Up to 10
2xlarge       |   8  |  32              |  64              | Up to 4750                     | Up to 10
4xlarge       |  16  |  64              | 128              | 4750                           | Up to 10
8xlarge       |  32  | 128              | 256              | 9000                           | 12
12xlarge      |  48  | 192              | 384              | 13500                          | 20
16xlarge      |  64  | 256              | 512              | 19000                          | 25

Let’s Start Your First Graviton2 Based Instance

To start a new RDS instance, I use the AWS Management Console or the AWS Command Line Interface (CLI), just like usual, and select one of the db.m6g or db.r6g instance types (this page in the documentation has all the details). Using the CLI, it would be:

aws rds create-db-instance \
    --region us-west-2 \
    --db-instance-identifier $DB_INSTANCE_NAME \
    --db-instance-class db.m6g.large \
    --engine postgres \
    --engine-version 12.3 \
    --allocated-storage 20 \
    --master-username $MASTER_USER \
    --master-user-password $MASTER_PASSWORD

The CLI confirms with:

{
    "DBInstance": {
        "DBInstanceIdentifier": "newsblog",
        "DBInstanceClass": "db.m6g.large",
        "Engine": "postgres",
        "DBInstanceStatus": "creating",
        ...
}

Migrating to Graviton2 instances is easy. In the AWS Management Console, I select my database and I click Modify. Then I select the new DB instance class. Or, using the CLI, I can use the modify-db-instance API call. There is a short service interruption when you switch instance type. By default, the modification will happen during your next maintenance window, unless you enable the ApplyImmediately option.
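For scripted migrations, the same change can be made through the API. A hedged boto3 sketch, with the instance identifier and target class as placeholders:

# Hedged sketch: switch an existing RDS instance to a Graviton2 class via boto3.
# ApplyImmediately=True triggers the change now instead of at the next
# maintenance window; expect the short service interruption mentioned above.
import boto3

rds = boto3.client("rds", region_name="us-west-2")
rds.modify_db_instance(
    DBInstanceIdentifier="newsblog",      # placeholder instance name
    DBInstanceClass="db.m6g.large",       # target Graviton2 instance class
    ApplyImmediately=True,
)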
You can provision new or migrate to Graviton2 Amazon Relational Database Service (RDS) instances in all regions where EC2 M6g and R6g are available: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Frankfurt) AWS Regions. As usual, let us know your feedback on the AWS Forum or through your usual AWS contact. -- seb